Artificial Intelligence in Ocular Transcriptomics: Applications of Unsupervised and Supervised Learning
Abstract
1. Introduction
2. Transcriptomic Modalities
2.1. Microarray
2.2. Bulk RNA-Seq
2.3. scRNA-Seq
3. Artificial Intelligence Approaches
3.1. Unsupervised Machine Learning
3.1.1. Shallow Methods: PCA, Clustering, WGCNA
3.1.2. Deep Methods: Autoencoders, scVI
3.2. Supervised Learning
3.2.1. Shallow Methods: SVM, RF, LASSO
3.2.2. Deep Methods: Neural Networks
4. AI Applications in Ocular Diseases and Retinal Development
4.1. Corneal Disease
4.2. Age-Related Macular Degeneration
4.3. Retinal Development
4.4. Diabetic Retinopathy
4.5. Glaucoma
4.6. Thyroid Eye Disease
4.7. Posterior Capsule Opacification
5. Challenges and Limitations
6. Future Directions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
AI | Artificial intelligence |
AMD | Age-related macular degeneration |
AUCs | Areas Under the Curve |
BaySeq | Bayesian Sequencing |
CIBERSORT | Cell type Identification By Estimating Relative Subsets Of RNA Transcripts |
DAVID | Database for Annotation, Visualization, and Integrated Discovery |
DEGs | Differentially expressed genes |
DESeq2 | Differential Expression analysis based on the Negative Binomial distribution (version 2) |
DL | Deep learning |
DME | Diabetic macular edema |
DNN | Deep neural networks |
DR | Diabetic retinopathy |
DS-NMF | Deep Subspace Nonnegative Matrix Factorization |
ECM | Extracellular matrix |
edgeR | Empirical Analysis of Digital Gene Expression Data in R |
GO | Gene Ontology |
GSEA | Gene Set Enrichment Analysis |
GSVA | Gene Set Variation Analysis |
GWAS | Genome-wide association study |
iPSC | Induced pluripotent stem cells |
KCS | Keratoconjunctivitis sicca |
LASSO | Least absolute shrinkage and selection operator |
LIGER | Linked Inference of Genomic Experimental Relationships |
LIME | Local Interpretable Model–Agnostic Explanations |
limma | Linear Models for Microarray Data |
MAGIC | Markov Affinity-based Graph Imputation of Cells |
MCODE | Molecular Complex Detection |
ML | Machine learning |
NMF | Non-negative matrix factorization |
NPDR | Non-proliferative diabetic retinopathy |
OCT | Optical coherence tomography |
PCA | Principal component analysis |
PDMS | Polydimethylsiloxane |
PDR | Proliferative diabetic retinopathy |
POAG | Primary open-angle glaucoma |
PVR | Proliferative vitreoretinopathy |
QBAM | Quantitative brightfield absorbance microscopy |
RF | Random forest |
RGCs | Retinal ganglion cells |
RNA-seq | RNA sequencing |
RPE | Retinal pigment epithelium |
SCCAF | Single-Cell Clustering Assessment Framework |
scGPS | Single-Cell Global fate Potential of Subpopulations |
scRNA-seq | Single-cell RNA sequencing |
scVI | Single-cell variational inference |
SHAP | SHapley Additive exPlanations |
SVM | Support vector machine |
TED | Thyroid eye disease |
TEMPO | Tracing Expression of Multiple Protein Origins |
TNF | Tumor Necrosis Factor |
t-SNE | t-distributed Stochastic Neighbor Embedding |
UMAP | Uniform Manifold Approximation and Projection |
VAE | Variational autoencoders |
VEGF | Vascular endothelial growth factor |
ν-SVR | Nu support vector regression |
WGCNA | Weighted gene co-expression network analysis |
XGBoost | eXtreme Gradient Boosting |
References
- Sinn, R.; Wittbrodt, J. An eye on eye development. Mech. Dev. 2013, 130, 347–358. [Google Scholar] [CrossRef]
- Chow, R.L.; Lang, R.A. Early eye development in vertebrates. Annu. Rev. Cell Dev. Biol. 2001, 17, 255–296. [Google Scholar] [CrossRef]
- Miesfeld, J.B.; Brown, N.L. Eye organogenesis: A hierarchical view of ocular development. Curr. Top. Dev. Biol. 2019, 132, 351–393. [Google Scholar]
- Zuber, M.E.; Gestri, G.; Viczian, A.S.; Barsacchi, G.; Harris, W.A. Specification of the vertebrate eye by a network of eye field transcription factors. Development 2003, 130, 5155–5167. [Google Scholar] [CrossRef]
- Vöcking, O.; Famulski, J.K. Single cell transcriptome analyses of the developing zebrafish eye—Perspectives and applications. Front. Cell Dev. Biol. 2023, 11, 1213382. [Google Scholar] [CrossRef]
- Voigt, A.P.; Whitmore, S.S.; Mulfaul, K.; Chirco, K.R.; Giacalone, J.C.; Flamme-Wiese, M.J.; Stockman, A.; Stone, E.M.; Tucker, B.A.; Scheetz, T.E.; et al. Bulk and single-cell gene expression analyses reveal aging human choriocapillaris has pro-inflammatory phenotype. Microvasc. Res. 2020, 131, 104031. [Google Scholar] [CrossRef]
- Hack, S.J.; Petereit, J.; Tseng, K.A.-S. Temporal Transcriptomic Profiling of the Developing Xenopus laevis Eye. Cells 2024, 13, 1390. [Google Scholar] [CrossRef]
- Lukowski, S.W.; Lo, C.Y.; Sharov, A.A.; Nguyen, Q.; Fang, L.; Hung, S.S.; Zhu, L.; Zhang, T.; Grünert, U.; Nguyen, T.; et al. A single-cell transcriptome atlas of the adult human retina. EMBO J. 2019, 38, e100811. [Google Scholar] [CrossRef]
- Voigt, A.P.; Mulfaul, K.; Mullin, N.K.; Flamme-Wiese, M.J.; Giacalone, J.C.; Stone, E.M.; Tucker, B.A.; Scheetz, T.E.; Mullins, R.F. Single-cell transcriptomics of the human retinal pigment epithelium and choroid in health and macular degeneration. Proc. Natl. Acad. Sci. USA 2019, 116, 24100–24107. [Google Scholar] [CrossRef]
- Liang, Q.; Cheng, X.; Wang, J.; Owen, L.; Shakoor, A.; Lillvis, J.L.; Zhang, C.; Farkas, M.; Kim, I.K.; Li, Y.; et al. A multi-omics atlas of the human retina at single-cell resolution. Cell Genom. 2023, 3, 100298. [Google Scholar] [CrossRef]
- Jackson, V.; Wu, Y.; Bonelli, R.; Owen, J.; Scott, L.; Farashi, S.; Kihara, Y.; Gantner, M.L.; Egan, C.; Williams, K.M.; et al. Multi-omic spatial effects on high-resolution AI-derived retinal thickness. Nat. Commun. 2025, 16, 1317. [Google Scholar] [CrossRef]
- Suo, L.; Dai, W.; Qin, X.; Li, G.; Zhang, D.; Cheng, T.; Yao, T.; Zhang, C. Screening of primary open-angle glaucoma diagnostic markers based on immune-related genes and immune infiltration. BMC Genom. Data 2022, 23, 67. [Google Scholar] [CrossRef]
- Liu, J.; Li, X.; Cheng, Y.; Liu, K.; Zou, H.; You, Z. Identification of potential ferroptosis-related biomarkers and a pharmacological compound in diabetic retinopathy based on machine learning and molecular docking. Front. Endocrinol. 2022, 13, 988506. [Google Scholar] [CrossRef]
- Ma, Q.; Hai, Y.; Shen, J. Signatures of Six Autophagy-Related Genes as Diagnostic Markers of Thyroid-Associated Ophthalmopathy and Their Correlation with Immune Infiltration. Immun. Inflamm. Dis. 2024, 12, e70093. [Google Scholar] [CrossRef]
- Owen, N.; Moosajee, M. RNA-sequencing in ophthalmology research: Considerations for experimental design and analysis. Ther. Adv. Ophthalmol. 2019, 11, 251584141983546. [Google Scholar] [CrossRef]
- Wang, J.-H.; Wong, R.C.; Liu, G.-S. Retinal aging transcriptome and cellular landscape in association with the progression of age-related macular degeneration. Investig. Ophthalmol. Vis. Sci. 2023, 64, 32. [Google Scholar] [CrossRef]
- Wang, J.-H.; Wong, R.C.B.; Liu, G.-S. Retinal Transcriptome and Cellular Landscape in Relation to the Progression of Diabetic Retinopathy. Investig. Ophthalmol. Vis. Sci. 2022, 63, 26. [Google Scholar] [CrossRef]
- Yang, T.; Zhang, N.; Yang, N. Single-cell sequencing in diabetic retinopathy: Progress and prospects. J. Transl. Med. 2025, 23, 49. [Google Scholar] [CrossRef]
- Ahsanuddin, S.; Wu, A.Y. Single-cell transcriptomics of the ocular anterior segment: A comprehensive review. Eye 2023, 37, 3334–3350. [Google Scholar] [CrossRef]
- Wang, D.; Pu, Y.; Tan, S.; Wang, X.; Zeng, L.; Lei, J.; Gao, X.; Li, H. Identification of immune-related biomarkers for glaucoma using gene expression profiling. Front. Genet. 2024, 15, 1366453. [Google Scholar] [CrossRef]
- Wu, X.; Deng, Q.; Han, Z.; Ni, F.; Sun, D.; Xu, Y. Screening and identification of genes related to ferroptosis in keratoconus. Sci. Rep. 2023, 13, 13956. [Google Scholar] [CrossRef]
- Cai, Y.; Zhou, T.; Cai, X.; Shi, W.; Sun, H.; Fu, Y. Deciphering mitochondrial dysfunction in keratoconus: Insights into ACSL4 from machine learning-based bulk and single-cell transcriptome analyses and experimental validation. Comput. Struct. Biotechnol. J. 2025, 27, 1962–1974. [Google Scholar] [CrossRef]
- Kuchroo, M.; DiStasio, M.; Song, E.; Calapkulu, E.; Zhang, L.; Ige, M.; Sheth, A.H.; Majdoubi, A.; Menon, M.; Tong, A.; et al. Single-cell analysis reveals inflammatory interactions driving macular degeneration. Nat. Commun. 2023, 14, 2589. [Google Scholar] [CrossRef]
- Zhang, S.; Yang, Y.; Chen, J.; Su, S.; Cai, Y.; Yang, X.; Sang, A. Integrating Multi-omics to Identify Age-Related Macular Degeneration Subtypes and Biomarkers. J. Mol. Neurosci. 2024, 74, 74. [Google Scholar] [CrossRef]
- Schaub, N.J.; Hotaling, N.A.; Manescu, P.; Padi, S.; Wan, Q.; Sharma, R.; George, A.; Chalfoun, J.; Simon, M.; Ouladi, M.; et al. Deep learning predicts function of live retinal pigment epithelium from quantitative microscopy. J. Clin. Investig. 2020, 130, 1010–1023. [Google Scholar] [CrossRef]
- Lu, C.; Mao, X.; Yuan, S. Decoding physiological and pathological roles of innate immune cells in eye diseases: The perspectives from single-cell RNA sequencing. Front. Immunol. 2024, 15, 1490719. [Google Scholar] [CrossRef]
- Syta, A.; Podkowiński, A.; Chorągiewicz, T.; Karpiński, R.; Gęca, J.; Wróbel-Dudzińska, D.; Jonak, K.E.; Głuchowski, D.; Maciejewski, M.; Rejdak, R.; et al. Machine learning-assisted early detection of keratoconus: A comparative analysis of corneal topography and biomechanical data. Sci. Rep. 2025, 15, 24399. [Google Scholar] [CrossRef] [PubMed]
- Ting, D.S.W.; Pasquale, L.R.; Peng, L.; Campbell, J.P.; Lee, A.Y.-Y.; Raman, R.; Tan, G.S.W.; Schmetterer, L.; Keane, P.A.; Wong, T.Y. Artificial intelligence and deep learning in ophthalmology. Br. J. Ophthalmol. 2019, 103, 167–175. [Google Scholar] [CrossRef]
- Hogarty, D.T.; Mackey, D.A.; Hewitt, A.W. Current state and future prospects of artificial intelligence in ophthalmology: A review. Clin. Exp. Ophthalmol. 2019, 47, 128–139. [Google Scholar] [CrossRef]
- Schena, M.; Shalon, D.; Davis, R.W.; Brown, P.O. Quantitative Monitoring of Gene Expression Patterns with a Complementary DNA Microarray. Science 1995, 270, 467–470. [Google Scholar] [CrossRef]
- Lockhart, D.J.; Winzeler, E.A. Genomics, gene expression and DNA arrays. Nature 2000, 405, 827–836. [Google Scholar] [CrossRef]
- Gao, X.; Yourick, M.R.; Campasino, K.; Zhao, Y.; Sepehr, E.; Vaught, C.; Sprando, R.L.; Yourick, J.J. An updated comparison of microarray and RNA-seq for concentration response transcriptomic study: Case studies with two cannabinoids, cannabichromene and cannabinol. BMC Genom. 2025, 26, 392. [Google Scholar] [CrossRef]
- Ritchie, M.E.; Phipson, B.; Wu, D.; Hu, Y.; Law, C.W.; Shi, W.; Smyth, G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47. [Google Scholar] [CrossRef] [PubMed]
- Huang, D.W.; Sherman, B.T.; Lempicki, R.A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 2009, 4, 44–57. [Google Scholar] [CrossRef] [PubMed]
- Candia, J.; Ferrucci, L. Assessment of Gene Set Enrichment Analysis using curated RNA-seq-based benchmarks. PLoS ONE 2024, 19, e0302696. [Google Scholar] [CrossRef]
- Subramanian, A.; Tamayo, P.; Mootha, V.K.; Mukherjee, S.; Ebert, B.L.; Gillette, M.A.; Paulovich, A.; Pomeroy, S.L.; Golub, T.R.; Lander, E.S.; et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA 2005, 102, 15545–15550. [Google Scholar] [CrossRef]
- Bolstad, B.M.; Irizarry, R.A.; Åstrand, M.; Speed, T.P. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics 2003, 19, 185–193. [Google Scholar] [CrossRef]
- Tan, D.S.P.; Lambros, M.B.; Natrajan, R.; Reis-Filho, J.S. Getting it right: Designing microarray (and not ‘microawry’) comparative genomic hybridization studies for cancer research. Lab. Investig. 2007, 87, 737–754. [Google Scholar] [CrossRef]
- Piccolo, S.R.; Sun, Y.; Campbell, J.D.; Lenburg, M.E.; Bild, A.H.; Johnson, W.E. A single-sample microarray normalization method to facilitate personalized-medicine workflows. Genomics 2012, 100, 337–344. [Google Scholar] [CrossRef]
- Rhodius, V.A.; Gross, C.A. Using DNA microarrays to assay part function. Methods Enzymol. 2011, 497, 75–113. [Google Scholar]
- Leek, J.T.; Scharpf, R.B.; Corrada-Bravo, H.; Simcha, D.; Langmead, B.; Johnson, W.E.; Geman, D.; Baggerly, K.; Irizarry, R.A. Tackling the widespread and critical impact of batch effects in high-throughput data. Nat. Rev. Genet. 2010, 11, 733–739. [Google Scholar] [CrossRef]
- Agapito, G.; Milano, M.; Cannataro, M. A statistical network pre-processing method to improve relevance and significance of gene lists in microarray gene expression studies. BMC Bioinform. 2022, 23, 393. [Google Scholar] [CrossRef]
- Tzec-Interián, J.A.; González-Padilla, D.; Góngora-Castillo, E.B. Bioinformatics perspectives on transcriptomics: A comprehensive review of bulk and single-cell RNA sequencing analyses. Quant. Biol. 2025, 13, e78. [Google Scholar] [CrossRef]
- Donato, L.; Bramanti, P.; Scimone, C.; Rinaldi, C.; D’Angelo, R.; Sidoti, A. miRNA expression profile of retinal pigment epithelial cells under oxidative stress conditions. FEBS Open Bio 2018, 8, 219–233. [Google Scholar] [CrossRef]
- You, J.; Corley, S.M.; Wen, L.; Hodge, C.; Höllhumer, R.; Madigan, M.C.; Wilkins, M.R.; Sutton, G. RNA-Seq analysis and comparison of corneal epithelium in keratoconus and myopia patients. Sci. Rep. 2018, 8, 389. [Google Scholar] [CrossRef]
- Lozano, D.C.; Choi, D.; Jayaram, H.; Morrison, J.C.; Johnson, E.C. Utilizing RNA-Seq to Identify Differentially Expressed Genes in Glaucoma Model Tissues, Such as the Rodent Optic Nerve Head. In Methods in Molecular Biology; Springer: New York, NY, USA, 2018; pp. 299–310. [Google Scholar]
- Anand, D.; Kakrana, A.; Siddam, A.D.; Huang, H.; Saadi, I.; Lachke, S.A. RNA sequencing-based transcriptomic profiles of embryonic lens development for cataract gene discovery. Hum. Genet. 2018, 137, 941–954. [Google Scholar] [CrossRef]
- Love, M.I.; Huber, W.; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014, 15, 550. [Google Scholar] [CrossRef]
- Hardcastle, T.J.; Kelly, K.A. baySeq: Empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinform. 2010, 11, 422. [Google Scholar] [CrossRef]
- Robinson, M.D.; McCarthy, D.J.; Smyth, G.K. edgeR: A Bioconductor package f1or differential expression analysis of digital gene expression data. Bioinformatics 2010, 26, 139–140. [Google Scholar] [CrossRef]
- Zeng, I.S.L.; Lumley, T. Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science). Bioinform. Biol. Insights 2018, 12, 117793221875929. [Google Scholar] [CrossRef]
- Yu, G.; Wang, L.-G.; Han, Y.; He, Q.-Y. clusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters. OMICS A J. Integr. Biol. 2012, 16, 284–287. [Google Scholar] [CrossRef]
- Deshpande, D.; Chhugani, K.; Chang, Y.; Karlsberg, A.; Loeffler, C.; Zhang, J.; Muszyńska, A.; Munteanu, V.; Yang, H.; Rotman, J.; et al. RNA-seq data science: From raw data to effective interpretation. Front. Genet. 2023, 14, 997383. [Google Scholar] [CrossRef]
- Koch, C.M.; Chiu, S.F.; Akbarpour, M.; Bharat, A.; Ridge, K.M.; Bartom, E.T.; Winter, D.R. A Beginner’s Guide to Analysis of RNA Sequencing Data. Am. J. Respir. Cell Mol. Biol. 2018, 59, 145–157. [Google Scholar] [CrossRef]
- Van den Berge, K.; Hembach, K.M.; Soneson, C.; Tiberi, S.; Clement, L.; Love, M.I.; Patro, R.; Robinson, M.D. RNA sequencing data: Hitchhiker’s guide to expression analysis. Annu. Rev. Biomed. Data Sci. 2019, 2, 139–173. [Google Scholar] [CrossRef]
- Yu, Y.; Mai, Y.; Zheng, Y.; Shi, L. Assessing and mitigating batch effects in large-scale omics studies. Genome Biol. 2024, 25, 254. [Google Scholar] [CrossRef]
- Zhao, S.; Ye, Z.; Stanton, R. Misuse of RPKM or TPM normalization when comparing across samples and sequencing protocols. RNA 2020, 26, 903–909. [Google Scholar] [CrossRef]
- Heil, B.J.; Crawford, J.; Greene, C.S. The effect of non-linear signal in classification problems using gene expression. PLoS Comput. Biol. 2023, 19, e1010984. [Google Scholar] [CrossRef]
- Han, D.; He, X. Screening for biomarkers in age-related macular degeneration. Heliyon 2023, 9, e16981. [Google Scholar] [CrossRef]
- Huang, J.; Zhou, Q. CD8+T Cell-Related Gene Biomarkers in Macular Edema of Diabetic Retinopathy. Front. Endocrinol. 2022, 13, 907396. [Google Scholar] [CrossRef]
- Huang, J.; Zhou, Q. Gene Biomarkers Related to Th17 Cells in Macular Edema of Diabetic Retinopathy: Cutting-Edge Comprehensive Bioinformatics Analysis and In Vivo Validation. Front. Immunol. 2022, 13, 858972. [Google Scholar] [CrossRef]
- Libbrecht, M.W.; Noble, W.S. Machine learning applications in genetics and genomics. Nat. Rev. Genet. 2015, 16, 321–332. [Google Scholar] [CrossRef]
- Cheng, Z.; Hao, J.; Cai, S.; Feng, P.; Chen, W.; Ma, X.; Li, X. A novel combined oxidative stress and extracellular matrix related predictive gene signature for keratoconus. Biochem. Biophys. Res. Commun. 2025, 742, 151144. [Google Scholar] [CrossRef]
- Tang, F.; Barbacioru, C.; Wang, Y.; Nordman, E.; Lee, C.; Xu, N.; Wang, X.; Bodeau, J.; Tuch, B.B.; Siddiqui, A.; et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nat. Methods 2009, 6, 377–382. [Google Scholar] [CrossRef]
- Miao, Z.; Moreno, P.; Huang, N.; Papatheodorou, I.; Brazma, A.; Teichmann, S.A. Putative cell type discovery from single-cell gene expression data. Nat. Methods 2020, 17, 621–628. [Google Scholar] [CrossRef]
- Hwang, B.; Lee, J.H.; Bang, D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp. Mol. Med. 2018, 50, 96. [Google Scholar] [CrossRef]
- Denisenko, E.; Guo, B.B.; Jones, M.; Hou, R.; de Kock, L.; Lassmann, T.; Poppe, D.; Clément, O.; Simmons, R.K.; Lister, R.; et al. Systematic assessment of tissue dissociation and storage biases in single-cell and single-nucleus RNA-seq workflows. Genome Biol. 2020, 21, 130. [Google Scholar] [CrossRef]
- Rich, J.M.; Moses, L.; Einarsson, P.H.; Jackson, K.; Luebbert, L.; Booeshaghi, A.S.; Antonsson, S.; Sullivan, D.K.; Bray, N.; Melsted, P.; et al. The impact of package selection and versioning on single-cell RNA-seq analysis. bioRxiv 2024. bioRxiv:2024.04.04.588111. [Google Scholar] [CrossRef]
- Hu, Z.; Ahmed, A.A.; Yau, C. CIDER: An interpretable meta-clustering framework for single-cell RNA-seq data integration and evaluation. Genome Biol. 2021, 22, 337. [Google Scholar] [CrossRef]
- Blondel, V.D.; Guillaume, J.-L.; Lambiotte, R.; Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech. Theory Exp. 2008, 2008, P10008. [Google Scholar] [CrossRef]
- Traag, V.A.; Waltman, L.; Van Eck, N.J. From Louvain to Leiden: Guaranteeing well-connected communities. Sci. Rep. 2019, 9, 5233. [Google Scholar] [CrossRef]
- van der Maaten, L.; Hinton, G. Visualizing high-dimensional data using t-sne. J. Mach. Learn. Res. 2008, 9, 2579. [Google Scholar]
- Li, R.; Liu, J.; Yi, P.; Yang, X.; Chen, J.; Zhao, C.; Liao, X.; Wang, X.; Xu, Z.; Lu, H.; et al. Integrative Single-Cell Transcriptomics and Epigenomics Mapping of the Fetal Retina Developmental Dynamics. Adv. Sci. 2023, 10, 2206623. [Google Scholar] [CrossRef]
- Zhang, J.; Du, T.; Jin, Y.; Bao, Y.; Ma, Q.; Cai, Y.-D.; Zhang, J. Machine Learning Identifies Key Gene Markers Related to Fetal Retina Development at Single-Cell Transcription Level. Investig. Ophthalmol. Vis. Sci. 2025, 66, 60. [Google Scholar] [CrossRef]
- Yazici, İ.; Shayea, I.; Din, J. A survey of applications of artificial intelligence and machine learning in future mobile networks-enabled systems. Eng. Sci. Technol. Int. J. 2023, 44, 101455. [Google Scholar] [CrossRef]
- Voigt, A.P.; Mullin, N.K.; Stone, E.M.; Tucker, B.A.; Scheetz, T.E.; Mullins, R.F. Single-cell RNA sequencing in vision research: Insights into human retinal health and disease. Prog. Retin. Eye Res. 2021, 83, 100934. [Google Scholar] [CrossRef]
- Wang, Y.; Miller, D.J.; Clarke, R. Approaches to working in high-dimensional data spaces: Gene expression microarrays. Br. J. Cancer 2008, 98, 1023–1028. [Google Scholar] [CrossRef]
- Angermueller, C.; Pärnamaa, T.; Parts, L.; Stegle, O. Deep learning for computational biology. Mol. Syst. Biol. 2016, 12, 878. [Google Scholar] [CrossRef]
- Oleynik, M.; Kugic, A.; Kasáč, Z.; Kreuzthaler, M. Evaluating shallow and deep learning strategies for the 2018 n2c2 shared task on clinical text classification. J. Am. Med. Inf. Assoc. 2019, 26, 1247–1254. [Google Scholar] [CrossRef]
- Norrie, J.L.; Lupo, M.S.; Little, D.R.; Shirinifard, A.; Mishra, A.; Zhang, Q.; Geiger, N.; Putnam, D.; Djekidel, N.; Ramirez, C.; et al. Latent epigenetic programs in Müller glia contribute to stress and disease response in the retina. Dev. Cell 2025, 60, 1199–1216. [Google Scholar] [CrossRef] [PubMed]
- Dong, Z.; Wang, C.; Dou, S.; Yang, X.; Wang, D.; Shi, K.; Wu, N. JAK1, SKI, ZBTB16 as potential biomarkers mediate the inflammatory response in keratoconjunctivitis sicca. Gene 2024, 927, 148691. [Google Scholar] [CrossRef]
- Ringnér, M. What is principal component analysis? Nat. Biotechnol. 2008, 26, 303–304. [Google Scholar] [CrossRef]
- McInnes, L.; Healy, J.; Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimensionality Reduction. arXiv 2018, arXiv:1802.03426. [Google Scholar]
- Islam, S.; Anand, S.; Hamid, J.; Thabane, L.; Beyene, J. Comparing the performance of linear and nonlinear principal components in the context of high-dimensional genomic data integration. Stat. Appl. Genet. Mol. Biol. 2017, 16, 199–216. [Google Scholar] [CrossRef]
- Nayak, R.; Hasija, Y. A hitchhiker’s guide to single-cell transcriptomics and data analysis pipelines. Genomics 2021, 113, 606–619. [Google Scholar] [CrossRef]
- Van Dijk, D.; Sharma, R.; Nainys, J.; Yim, K.; Kathail, P.; Carr, A.J.; Burdziak, C.; Moon, K.R.; Chaffer, C.L.; Pattabiraman, D.; et al. Recovering Gene Interactions from Single-Cell Data Using Data Diffusion. Cell 2018, 174, 716–729.e27. [Google Scholar] [CrossRef]
- Stegle, O.; Teichmann, S.A.; Marioni, J.C. Computational and analytical challenges in single-cell transcriptomics. Nat. Rev. Genet. 2015, 16, 133–145. [Google Scholar] [CrossRef]
- Wang, S.K.; Nair, S.; Li, R.; Kraft, K.; Pampari, A.; Patel, A.; Kang, J.B.; Luong, C.; Kundaje, A.; Chang, H.Y. Single-cell multiome of the human retina and deep learning nominate causal variants in complex eye diseases. Cell Genom. 2022, 2, 100164. [Google Scholar] [CrossRef]
- Haghverdi, L.; Lun, A.T.L.; Morgan, M.D.; Marioni, J.C. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat. Biotechnol. 2018, 36, 421–427. [Google Scholar] [CrossRef]
- Korsunsky, I.; Millard, N.; Fan, J.; Slowikowski, K.; Zhang, F.; Wei, K.; Baglaenko, Y.; Brenner, M.; Loh, P.-R.; Raychaudhuri, S. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 2019, 16, 1289–1296. [Google Scholar] [CrossRef]
- Zhang, S.; Li, X.; Lin, J.; Lin, Q.; Wong, K.-C. Review of single-cell RNA-seq data clustering for cell-type identification and characterization. RNA 2023, 29, 517–530. [Google Scholar] [CrossRef]
- Johnson, K.A.; Krishnan, A. Robust normalization and transformation techniques for constructing gene coexpression networks from RNA-seq data. Genome Biol. 2022, 23, 1. [Google Scholar] [CrossRef]
- Jaskowiak, P.A.; Campello, R.J.; Costa, I.G. On the selection of appropriate distances for gene expression data clustering. BMC Bioinform. 2014, 15, S2. [Google Scholar] [CrossRef]
- Do, J.H.; Choi, D.-K. Clustering Approaches to Identifying Gene Expression Patterns from DNA Microarray Data. Mol. Cells 2008, 25, 279–288. [Google Scholar] [CrossRef]
- Pantano, L.; Hutchinson, J.; Barrera, V.; Kirchner, R.; Steinbaugh, M. DEGreport: Report of DEG analysis. Bioconductor, 15 April 2025. [Google Scholar]
- Langfelder, P.; Horvath, S. WGCNA: An R package for weighted correlation network analysis. BMC Bioinform. 2008, 9, 559. [Google Scholar] [CrossRef]
- Ma, K.; Nakajima, H.; Basak, N.; Barman, A.; Ratnapriya, R. Integrating explainable machine learning and transcriptomics data reveals cell-type specific immune signatures underlying macular degeneration. npj Genom. Med. 2025, 10, 48. [Google Scholar] [CrossRef]
- Saelens, W.; Cannoodt, R.; Todorov, H.; Saeys, Y. A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 2019, 37, 547–554. [Google Scholar] [CrossRef]
- Setty, M.; Kiseliovas, V.; Levine, J.; Gayoso, A.; Mazutis, L.; Pe’Er, D. Characterization of cell fate probabilities in single-cell data with Palantir. Nat. Biotechnol. 2019, 37, 451–460. [Google Scholar] [CrossRef]
- Butler, A.; Hoffman, P.; Smibert, P.; Papalexi, E.; Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 2018, 36, 411–420. [Google Scholar] [CrossRef]
- Stuart, T.; Butler, A.; Hoffman, P.; Hafemeister, C.; Papalexi, E.; Mauck, W.M.; Hao, Y.; Stoeckius, M.; Smibert, P.; Satija, R. Comprehensive Integration of Single-Cell Data. Cell 2019, 177, 1888–1902.e21. [Google Scholar] [CrossRef]
- Jia, X.; Wu, J.; Chen, X.; Hou, S.; Li, Y.; Zhao, L.; Zhu, Y.; Li, Z.; Deng, C.; Su, W.; et al. Cell atlas of trabecular meshwork in glaucomatous non-human primates and DEGs related to tissue contract based on single-cell transcriptomics. iScience 2023, 26, 108024. [Google Scholar] [CrossRef]
- Welch, J.D.; Kozareva, V.; Ferreira, A.; Vanderburg, C.; Martin, C.; Macosko, E.Z. Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell Identity. Cell 2019, 177, 1873–1887.e17. [Google Scholar] [CrossRef]
- Avila Cobos, F.; Alquicira-Hernandez, J.; Powell, J.E.; Mestdagh, P.; De Preter, K. Benchmarking of cell type deconvolution pipelines for transcriptomics data. Nat. Commun. 2020, 11, 5650. [Google Scholar] [CrossRef]
- Gong, T.; Szustakowski, J.D. DeconRNASeq: A statistical framework for deconvolution of heterogeneous tissue samples based on mRNA-Seq data. Bioinformatics 2013, 29, 1083–1085. [Google Scholar] [CrossRef]
- Jin, H.; Liu, Z. A benchmark for RNA-seq deconvolution analysis under dynamic testing environments. Genome Biol. 2021, 22, 102. [Google Scholar] [CrossRef]
- Newman, A.M.; Steen, C.B.; Liu, C.L.; Gentles, A.J.; Chaudhuri, A.A.; Scherer, F.; Khodadoust, M.S.; Esfahani, M.S.; Luca, B.A.; Steiner, D.; et al. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat. Biotechnol. 2019, 37, 773–782. [Google Scholar] [CrossRef]
- Newman, A.M.; Liu, C.L.; Green, M.R.; Gentles, A.J.; Feng, W.; Xu, Y.; Hoang, C.D.; Diehn, M.; Alizadeh, A.A. Robust enumeration of cell subsets from tissue expression profiles. Nat. Methods 2015, 12, 453–457. [Google Scholar] [CrossRef]
- Miao, Y.R.; Zhang, Q.; Lei, Q.; Luo, M.; Xie, G.Y.; Wang, H.; Guo, A.Y. ImmuCellAI: A Unique Method for Comprehensive T-Cell Subsets Abundance Prediction and its Application in Cancer Immunotherapy. Adv. Sci. 2020, 7, 1902880. [Google Scholar] [CrossRef]
- Das, S.; McClain, C.J.; Rai, S.N. Fifteen Years of Gene Set Analysis for High-Throughput Genomic Data: A Review of Statistical Approaches and Future Challenges. Entropy 2020, 22, 427. [Google Scholar] [CrossRef]
- Wang, Y.; Yang, X.; Zhang, Y.; Hong, L.; Xie, Z.; Jiang, W.; Chen, L.; Xiong, K.; Yang, S.; Lin, M.; et al. Single-cell RNA sequencing reveals roles of unique retinal microglia types in early diabetic retinopathy. Diabetol. Metab. Syndr. 2024, 16, 49. [Google Scholar] [CrossRef]
- Hänzelmann, S.; Castelo, R.; Guinney, J. GSVA: Gene set variation analysis for microarray and RNA-Seq data. BMC Bioinform. 2013, 14, 7. [Google Scholar] [CrossRef]
- Wang, Z.; Huang, Y.; Chu, F.; Liao, K.; Cui, Z.; Chen, J.; Tang, S. Integrated Analysis of DNA methylation and transcriptome profile to identify key features of age-related macular degeneration. Bioengineered 2021, 12, 7061–7078. [Google Scholar] [CrossRef]
- Yousef, M.; Allmer, J. Deep learning in bioinformatics. Turk. J. Biol. 2023, 47, 366–382. [Google Scholar] [CrossRef]
- Lopez, R.; Regier, J.; Cole, M.B.; Jordan, M.I.; Yosef, N. Deep generative modeling for single-cell transcriptomics. Nat. Methods 2018, 15, 1053–1058. [Google Scholar] [CrossRef]
- Tran, D.; Nguyen, H.; Tran, B.; La Vecchia, C.; Luu, H.N.; Nguyen, T. Fast and precise single-cell data analysis using a hierarchical autoencoder. Nat. Commun. 2021, 12, 1029. [Google Scholar] [CrossRef] [PubMed]
- Eraslan, G.; Simon, L.M.; Mircea, M.; Mueller, N.S.; Theis, F.J. Single-cell RNA-seq denoising using a deep count autoencoder. Nat. Commun. 2019, 10, 390. [Google Scholar] [CrossRef] [PubMed]
- Dou, B.; Zhu, Z.; Merkurjev, E.; Ke, L.; Chen, L.; Jiang, J.; Zhu, Y.; Liu, J.; Zhang, B.; Wei, G.-W. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem. Rev. 2023, 123, 8736–8780. [Google Scholar] [CrossRef]
- Tibshirani, R. Regression Shrinkage and Selection Via the Lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288. [Google Scholar] [CrossRef]
- Simon, N.; Friedman, J.H.; Hastie, T.; Tibshirani, R. Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent. J. Stat. Softw. 2011, 39, 1–13. [Google Scholar] [CrossRef]
- Zou, H.; Hastie, T. Regularization and Variable Selection via the Elastic Net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320. [Google Scholar] [CrossRef]
- Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
- Huang, S.; Cai, N.; Pacheco, P.P.; Narrandes, S.; Wang, Y.; Xu, W. Applications of Support Vector Machine (SVM) Learning in Cancer Genomics. Cancer Genom. Proteom. 2018, 15, 41–51.
- Guyon, I.; Weston, J.; Barnhill, S.; Vapnik, V. Gene Selection for Cancer Classification using Support Vector Machines. Mach. Learn. 2002, 46, 389–422.
- Ding, Y.; Wilkins, D. Improving the Performance of SVM-RFE to Select Genes in Microarray Data. BMC Bioinform. 2006, 7, S12.
- Li, Z.; Xie, W.; Liu, T. Efficient feature selection and classification for microarray data. PLoS ONE 2018, 13, e0202167.
- Díaz-Uriarte, R.; Alvarez de Andrés, S. Gene selection and classification of microarray data using random forest. BMC Bioinform. 2006, 7, 3.
- Bentéjac, C.; Csörgő, A.; Martínez-Muñoz, G. A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 2021, 54, 1937–1967.
- Strobl, C.; Boulesteix, A.-L.; Zeileis, A.; Hothorn, T. Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinform. 2007, 8, 25.
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Zhang, C.; Liu, C.; Zhang, X.; Almpanidis, G. An up-to-date comparison of state-of-the-art classification algorithms. Expert Syst. Appl. 2017, 82, 128–150.
- Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. In Advances in Neural Information Processing Systems; Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2018; pp. 6638–6648.
- Hancock, J.T.; Khoshgoftaar, T.M. CatBoost for big data: An interdisciplinary review. J. Big Data 2020, 7, 94.
- Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Advances in Neural Information Processing Systems; Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., Garnett, R., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 3146–3154.
- Freund, Y.; Schapire, R.E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. Syst. Sci. 1997, 55, 119–139.
- Martinez, W.; Gray, J.B. Noise peeling methods to improve boosting algorithms. Comput. Stat. Data Anal. 2016, 93, 483–497.
- Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42.
- Moerman, T.; Aibar Santos, S.; Bravo González-Blas, C.; Simm, J.; Moreau, Y.; Aerts, J.; Aerts, S.; Kelso, J. GRNBoost2 and Arboreto: Efficient and scalable inference of gene regulatory networks. Bioinformatics 2019, 35, 2159–2161.
- Pratapa, A.; Jalihal, A.P.; Law, J.N.; Bharadwaj, A.; Murali, T.M. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat. Methods 2020, 17, 147–154.
- Karamveer; Uzun, Y. Approaches for Benchmarking Single-Cell Gene Regulatory Network Methods. Bioinform. Biol. Insights 2024, 18, 11779322241287120.
- Thompson, M.; Matsumoto, M.; Ma, T.; Senabouth, A.; Palpant, N.J.; Powell, J.E.; Nguyen, Q. scGPS: Determining Cell States and Global Fate Potential of Subpopulations. Front. Genet. 2021, 12, 666771.
- Aran, D.; Hu, Z.; Butte, A.J. xCell: Digitally portraying the tissue cellular heterogeneity landscape. Genome Biol. 2017, 18, 220.
- Sutton, G.J.; Poppe, D.; Simmons, R.K.; Walsh, K.; Nawaz, U.; Lister, R.; Gagnon-Bartsch, J.A.; Voineagu, I. Comprehensive evaluation of deconvolution methods for human brain gene expression. Nat. Commun. 2022, 13, 1358.
- Chin, C.-H.; Chen, S.-H.; Wu, H.-H.; Ho, C.-W.; Ko, M.-T.; Lin, C.-Y. cytoHubba: Identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol. 2014, 8, S11.
- Bader, G.D.; Hogue, C.W. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. 2003, 4, 2.
- Oca, A.I.; Pérez-Sala, Á.; Pariente, A.; Ochoa, R.; Velilla, S.; Peláez, R.; Larráyoz, I.M. Predictive Biomarkers of Age-Related Macular Degeneration Response to Anti-VEGF Treatment. J. Pers. Med. 2021, 11, 1329.
- Toh, H.; Smolentsev, A.; Sadjadi, R.; Clegg, D.; Yan, J.; Stewart, R.; Thomson, J.A.; Jiang, P. Transcriptomic clock predicts vascular changes of prodromal diabetic retinopathy. Sci. Rep. 2023, 13, 12968.
- Laich, Y.; Wolf, J.; Hajdu, R.I.; Schlecht, A.; Bucher, F.; Pauleikhoff, L.; Busch, M.; Martin, G.; Faatz, H.; Killmer, S.; et al. Single-Cell Protein and Transcriptional Characterization of Epiretinal Membranes from Patients with Proliferative Vitreoretinopathy. Investig. Ophthalmol. Vis. Sci. 2022, 63, 17.
- Goetz, J.; Jessen, Z.F.; Jacobi, A.; Mani, A.; Cooler, S.; Greer, D.; Kadri, S.; Segal, J.; Shekhar, K.; Sanes, J.R.; et al. Unified classification of mouse retinal ganglion cells using function, morphology, and gene expression. Cell Rep. 2022, 40, 111040.
- Zhao, S.; Dai, Q.; Rao, Z.; Li, J.; Wang, A.; Gao, Z.; Fan, Y. Identification of Optic Nerve–Related Biomarkers in Primary Open-Angle Glaucoma Based on Comprehensive Bioinformatics and Mendelian Randomization. Transl. Vis. Sci. Technol. 2024, 13, 21.
- Shu, X.; Zeng, C.; Zhu, Y.; Chen, Y.; Huang, X.; Wei, R. Screening of pathologically significant diagnostic biomarkers in tears of thyroid eye disease based on bioinformatic analysis and machine learning. Front. Cell Dev. Biol. 2024, 12, 1486170.
- Lachke, S.A.; Ho, J.W.K.; Kryukov, G.V.; O’Connell, D.J.; Aboukhalil, A.; Bulyk, M.L.; Park, P.J.; Maas, R.L. iSyTE: Integrated Systems Tool for Eye gene discovery. Investig. Ophthalmol. Vis. Sci. 2012, 53, 1617–1627.
- Tangeman, J.A.; Rebull, S.M.; Grajales-Esquivel, E.; Weaver, J.M.; Bendezu-Sayas, S.; Robinson, M.L.; Lachke, S.A.; Del Rio-Tsonis, K. Integrated single-cell multiomics uncovers foundational regulatory mechanisms of lens development and pathology. Development 2024, 151, dev202249.
- Disatham, J.; Brennan, L.A.; Kantorow, M. Epigenetic regulation during lens fiber cell differentiation. Epigenet. Chromatin 2022, 15, 9.
- Disatham, J.; Brennan, L.; Kantorow, M.; Cvekl, A. Profiling chromatin accessibility during lens development reveals regulatory motif dynamics and Pax6 involvement. Epigenet. Chromatin 2019, 12, 55.
- Jiang, J.; Shihan, M.H.; Wang, Y.; Duncan, M.K. Lens Epithelial Cells Initiate an Inflammatory Response Following Cataract Surgery. Investig. Ophthalmol. Vis. Sci. 2018, 59, 4986–4997.
- Faranda, A.P.; Shihan, M.H.; Wang, Y.; Duncan, M.K. The aging mouse lens transcriptome. Exp. Eye Res. 2021, 209, 108663.
- Novo, S.G.; Faranda, A.P.; D’Antin, J.C.; Wang, Y.; Shihan, M.; Barraquer, R.I.; Michael, R.; Duncan, M.K. Human lens epithelial cells induce the inflammatory response when placed into the lens capsular bag model of posterior capsular opacification. Mol. Vis. 2024, 30, 348–367.
- Duot, M.; Coomson, S.Y.; Shrestha, S.K.; Nagulla, M.M.K.; Audic, Y.; Barve, R.A.; Huang, H.; Gautier-Courteille, C.; Paillard, L.; Lachke, S.A. Transcriptome Meta-Analysis Uncovers Cell-Specific Regulatory Relationships in Embryonic, Juvenile, Adult, and Aged Mouse Lens Epithelium and Fibers. Investig. Ophthalmol. Vis. Sci. 2025, 66, 42.
- Gorai, S.; Faranda, A.P.; Shihan, M.H.; Wang, Y.; Duncan, M.K. LIRTS Viewer: A Web-Based Resource to View the Transcriptional Response of Lens Epithelial Cells to Injury. Investig. Ophthalmol. Vis. Sci. 2025, 66, 53.
- Zhao, Y.; Zheng, D.; Cvekl, A. A comprehensive spatial-temporal transcriptomic analysis of differentiating nascent mouse lens epithelial and fiber cells. Exp. Eye Res. 2018, 175, 56–72.
- Zhao, Y.; Wilmarth, P.A.; Cheng, C.; Limi, S.; Fowler, V.M.; Zheng, D.; David, L.L.; Cvekl, A. Proteome-transcriptome analysis and proteome remodeling in mouse lens epithelium and fibers. Exp. Eye Res. 2019, 179, 32–46.
- Disatham, J.; Brennan, L.; Cvekl, A.; Kantorow, M. Multiomics Analysis Reveals Novel Genetic Determinants for Lens Differentiation, Structure, and Transparency. Biomolecules 2023, 13, 693.
- Hao, C.; Li, K.; Wei, Z.; Radeen, K.R.; Zhang, X.; Purohit, S.; Fan, X. Transcriptomic Analysis of Human Lens Epithelium Tissue With and Without Cataract Surgery: Uncovering Novel Pathways of Post-Surgical Lens Epithelium Remodeling. Investig. Ophthalmol. Vis. Sci. 2025, 66, 28.
- Lalman, C.; Stabler, K.R.; Yang, Y.; Walker, J.L. Supervised machine-based learning and computational analysis to reveal unique molecular signatures associated with wound healing and fibrotic outcomes to lens injury. Int. J. Mol. Sci. 2025, 26, 7422.
- Kakati, T.; Bhattacharyya, D.K.; Kalita, J.K.; Norden-Krichmar, T.M. DEGnext: Classification of differentially expressed genes from RNA-seq data using a convolutional neural network with transfer learning. BMC Bioinform. 2022, 23, 17.
- Zeng, H.; Edwards, M.D.; Liu, G.; Gifford, D.K. Convolutional neural network architectures for predicting DNA–protein binding. Bioinformatics 2016, 32, i121–i127.
- Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1135–1144.
- Ekanayake, I.U.; Meddage, D.P.P.; Rathnayake, U. A novel approach to explain the black-box nature of machine learning in compressive strength predictions of concrete using Shapley additive explanations (SHAP). Case Stud. Constr. Mater. 2022, 16, e01059.
- Budhkar, A.; Song, Q.; Su, J.; Zhang, X. Demystifying the black box: A survey on explainable artificial intelligence (XAI) in bioinformatics. Comput. Struct. Biotechnol. J. 2025, 27, 346–359.
- Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 4765–4774.
- Liu, J.; Gao, J.; Xing, S.; Yan, Y.; Yan, X.; Jing, Y.; Li, X. Bioinformatics analysis of signature genes related to cell death in keratoconus. Sci. Rep. 2024, 14, 12749.
- Ahmed, Z.; Wan, S.; Zhang, F.; Zhong, W. Artificial intelligence for omics data analysis. BMC Methods 2024, 1, 4.
- Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215.
- Min, S.; Lee, B.; Yoon, S. Deep learning in bioinformatics. Brief. Bioinform. 2017, 18, 851–869.
- Rung, J.; Brazma, A. Reuse of public genome-wide gene expression data. Nat. Rev. Genet. 2013, 14, 89–99.
- Topol, E.J. High-performance medicine: The convergence of human and artificial intelligence. Nat. Med. 2019, 25, 44–56.
- Oestreich, M.; Chen, D.; Schultze, J.L.; Fritz, M.; Becker, M. Privacy considerations for sharing genomics data. EXCLI J. 2021, 20, 1243–1260.
- Konnoth, C. AI and data protection law in health. In Research Handbook on Health, AI and the Law; Solaiman, B., Cohen, I.G., Eds.; Edward Elgar Publishing Ltd.: Cheltenham, UK, 2024; Chapter 7.
- Abbas, S.R.; Abbas, Z.; Zahir, A.; Lee, S.W. Advancing genome-based precision medicine: A review on machine learning applications for rare genetic disorders. Brief. Bioinform. 2025, 26, bbaf329.
- Pham, T. Ethical and legal considerations in healthcare AI: Innovation and policy for safe and fair use. R. Soc. Open Sci. 2025, 12, 241873.
- Haider, S.; Pal, R. Integrated analysis of transcriptomic and proteomic data. Curr. Genom. 2013, 14, 91–110.
- Argelaguet, R.; Velten, B.; Arnol, D.; Dietrich, S.; Zenz, T.; Marioni, J.C.; Buettner, F.; Huber, W.; Stegle, O. Multi-Omics Factor Analysis—A framework for unsupervised integration of multi-omics data sets. Mol. Syst. Biol. 2018, 14, e8124.
- Lee, C.H.; Yoon, H.-J. Medical big data: Promise and challenges. Kidney Res. Clin. Pract. 2017, 36, 3–11.
- Wolf, J.; Franco, J.A.; Yip, R.; Dabaja, M.Z.; Velez, G.; Liu, F.; Bassuk, A.G.; Mruthyunjaya, P.; Dufour, A.; Mahajan, V.B. Liquid Biopsy Proteomics in Ophthalmology. J. Proteome Res. 2024, 23, 511–522.
- Schaub, J.M.; Fu, D.J.; van Velthoven, C.T.J.; Park, D.Y.; Lin, J.H.; Lee, C.S. A comprehensive review of artificial intelligence models for screening major retinal diseases. Artif. Intell. Rev. 2024, 57, 3487–3518.
Algorithm | Modality | Biological Relevance |
---|---|---|
PCA, t-SNE, UMAP | Microarray, bulk RNA-seq, scRNA-seq | Dimensionality reduction |
MAGIC | scRNA-seq | Imputes missing values/dropouts |
Harmony | scRNA-seq | Corrects for batch effects |
DEGreport | Microarray, bulk RNA-seq, scRNA-seq | Groups genes by correlated expression patterns |
WGCNA | Microarray, bulk RNA-seq, scRNA-seq | Clusters groups of genes based on expression profiles |
Leiden | scRNA-seq | Detects cell communities by clustering single-cell transcriptomes |
Monocle3, Palantir | scRNA-seq | Reconstructs developmental trajectories |
Seurat | Microarray, bulk RNA-seq, scRNA-seq | Comprehensive toolkit for clustering, dimensionality reduction, and batch correction |
LIGER | scRNA-seq | Harmonizes data from multiple datasets |
CIBERSORT, CIBERSORTx | Microarray, bulk RNA-seq | Estimates cell type composition from mixed tissue samples |
GSVA | Microarray, bulk RNA-seq | Estimates variation in pathway activity across a sample population |
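
As an illustration of the dimensionality-reduction entry in the table above, the following is a minimal sketch of PCA on a samples × genes expression matrix, computed with a singular value decomposition in NumPy. The toy data, the function name `pca_scores`, and the group structure are assumptions made for the example only; they do not come from any study discussed here, and real analyses would typically normalize and log-transform counts first and use an established package.

```python
import numpy as np

def pca_scores(expr, n_components=2):
    """PCA of a samples x genes matrix via SVD.

    Returns the sample coordinates on the leading components and the
    fraction of total variance each of those components explains.
    """
    centered = expr - expr.mean(axis=0)  # center each gene across samples
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # sample coordinates on the PCs
    explained = s**2 / np.sum(s**2)  # variance fraction per component
    return scores, explained[:n_components]

# Hypothetical toy data: 20 samples x 50 genes, two groups of 10 samples each.
rng = np.random.default_rng(0)
group = np.repeat([0, 1], 10)
expr = rng.normal(0.0, 1.0, size=(20, 50))
expr[group == 1, :10] += 4.0  # group 1 over-expresses the first 10 genes

scores, explained = pca_scores(expr)
pc1 = scores[:, 0]
print("PC1 range, group 0:", pc1[group == 0].min(), pc1[group == 0].max())
print("PC1 range, group 1:", pc1[group == 1].min(), pc1[group == 1].max())
print("variance fraction of PC1, PC2:", explained)
```

In this toy setting the first component should separate the two groups, which is the behavior that makes PCA useful as a first look at batch or condition structure before clustering.
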
Algorithm | Category | What It Does | Strengths | Limitations |
---|---|---|---|---|
LASSO | Linear model | Selects a small set of predictive genes by shrinking the contribution of less significant genes [119] | Avoids overfitting, more interpretable [120] | Assumes linear relationships; can underperform with correlated predictors [121] |
SVM | Classifier | Separates classes by finding the optimal boundary [122] | Good for complex high-dimensional gene data with few samples [123] | Requires tuning |
SVM-RFE | Classifier with feature elimination | Iteratively removes uninformative genes [124] | Good for complex, nonlinear, high-dimensional data [124] | Slow; does not account for correlated features [125,126] |
RF | Decision tree ensemble | Combines many trees to improve accuracy and estimate gene importance [127] | Robust to noise and overfitting; more interpretable and precise [127,128] | Feature importance can be biased toward variables with more categories or more split points (e.g., continuous features) [129] |
XGBoost | Gradient boosting (ensemble) | Sequentially builds decision trees to correct previous errors [130] | Handles missing values, fast and efficient [130] | Can overfit without tuning [128,131] |
CatBoost | Gradient boosting (categorical) | Deals well with categorical variables, with strong generalization accuracy [128,132] | Performs well on mixed data types; minimal preprocessing of categorical features [133] | Different hyperparameters can significantly change speed/accuracy [133] |
LightGBM | Gradient boosting | Uses histogram-based learning [134] | Efficient memory usage; fast training; supports large-scale data [134] | Performance may degrade on datasets with extremely high-cardinality categorical features without tuning [132] |
AdaBoost | Boosted ensemble | Combines many models, correcting for mistakes made by earlier versions [135] | Fast, avoids overfitting, and handles nonlinear data well [135] | Sensitive to noise and outliers [136] |
ExtraTrees | Randomized tree ensemble | Builds multiple decision trees with extra randomness to reduce overfitting [137] | Fast and avoids overfitting [137] | Less accurate and harder to interpret [137] |
GRNBoost2 | Tree-based with network inference | Reconstructs gene regulatory networks [138] | Captures nonlinear regulatory relationships [138] | Requires large datasets; sensitive to noise [139,140] |
scGPS | Classifier with projection scoring | Trains classifiers on labeled cell subpopulations to infer trajectories and inter-sample similarity [141] | Good for comparing single-cell populations and predicting cell fates across datasets [8] | Performance may drop with novel or underrepresented cell types [141] |
xCell | Signature-based deconvolution | Estimates relative enrichment of immune cells [142] | Robust to noise, requires no retraining [142] | Limited to predefined signatures [143] |
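
To make the LASSO row of the table above concrete, here is a minimal sketch of how L1 shrinkage drives the coefficients of uninformative genes to exactly zero. It implements cyclic coordinate descent with soft-thresholding in NumPy on synthetic data. The function names, the penalty value `lam`, and the simulated "genes" are all assumptions for illustration; this is not the procedure used in any cited study, and in practice one would rely on a maintained implementation and choose the penalty by cross-validation.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n)||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    Assumes each column of X has mean 0 and variance 1 (ddof=0) and y is centered,
    so the per-coordinate update reduces to a single soft-thresholding step.
    """
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]  # add back gene j's current contribution
            rho = X[:, j] @ resid / n  # correlation of gene j with the partial residual
            beta[j] = soft_threshold(rho, lam)
            resid -= X[:, j] * beta[j]  # remove the updated contribution
    return beta

# Hypothetical toy data: 60 samples, 10 genes; only genes 0 and 1 drive the outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each gene
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=60)
y = y - y.mean()

beta = lasso_cd(X, y, lam=0.5)
selected = np.flatnonzero(beta)  # indices of genes with nonzero coefficients
print("selected genes:", selected)
print("coefficients:", np.round(beta, 3))
```

The retained coefficients are biased toward zero by roughly the penalty, which is the trade-off behind the interpretability noted in the table: fewer genes, at the cost of shrunken effect estimates.
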
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lalman, C.; Yang, Y.; Walker, J.L. Artificial Intelligence in Ocular Transcriptomics: Applications of Unsupervised and Supervised Learning. Cells 2025, 14, 1315. https://doi.org/10.3390/cells14171315