Deciphering Cell-Type-Specific Transcriptional Regulation in Tomato Leaves Through Ensemble Machine Learning and Single-Cell Transcriptomics
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Sources
2.2. Processing and Basic Analysis of Tomato Leaf scRNA-Seq FASTQ Data
2.3. RNA Velocity and Cellular Trajectory Inference
2.4. Single-Cell Gene Co-Expression Network Analysis of Tomato TFs
2.5. Construction of the Base (Prior) Gene Regulatory Network
2.6. Ensemble Machine Learning for Candidate Regulatory Factor Prioritization
2.7. In Silico Perturbation Simulations and Cell Fate Dynamics
3. Results
3.1. Single-Cell Transcriptomic Heterogeneity, Cell-Type Annotation, and Developmental Dynamics of Tomato Leaves
3.2. Weighted Gene Co-Expression Network Analysis Reveals Module-Level Organization and TF-Centric Developmental Dynamics in Tomato Leaf Cells
3.3. Machine Learning Prioritises Candidate TFs Associated with Tomato Leaf Cell Types
3.4. Integrative GRN Inference and Perturbation Modelling Prioritise Candidate Core Leaf Cell Regulators
4. Discussion
4.1. Comparative Insights into Leaf Cellular Heterogeneity and Developmental Ontogeny
4.2. Regulatory Logic of Modular TF Networks
4.3. Integrative Machine Learning and Perturbation Modeling Contextualize Core Regulatory Hubs
4.4. Limitations of This Research
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Shen, H.; Liu, W.; Li, Y.; He, Z.; Yang, Z.; Hu, Z.; Wu, T. Deciphering Cell-Type-Specific Transcriptional Regulation in Tomato Leaves Through Ensemble Machine Learning and Single-Cell Transcriptomics. Plants 2026, 15, 1578. https://doi.org/10.3390/plants15101578

