A Multitask Active Learning Framework with Probabilistic Modeling for Multi-Species Acute Toxicity Prediction
Abstract
1. Introduction
- We introduce the Probabilistic Multitask Active Learning (PMAL) framework tailored for multi-species acute toxicity prediction, which unifies multi-task learning and active learning to jointly address data scarcity and label noise in toxicological modeling.
- We formulate a probabilistic multi-task model that explicitly accounts for both epistemic and aleatoric uncertainty; leveraging this dual uncertainty quantification, we further design an active learning query strategy that prioritizes high-uncertainty compounds for experimental labeling.
- We conduct comprehensive experiments across 59 diverse acute toxicity endpoints spanning multiple species, demonstrating consistent improvements over established state-of-the-art methods under realistic low-labeling-budget settings.
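
The dual uncertainty idea in the second contribution can be illustrated with a minimal sketch. It assumes, purely for illustration, that the model outputs a Gaussian mixture per compound (weights, means, variances); under that assumption the law of total variance splits the predictive variance into an aleatoric part (expected within-component variance) and an epistemic part (disagreement between component means). The helper names `mixture_uncertainty` and `select_queries`, the toy compound identifiers, and the scoring rule (a plain sum of both parts) are hypothetical and are not taken from the paper.

```python
def mixture_uncertainty(weights, means, variances):
    """Decompose the predictive variance of a 1-D Gaussian mixture.

    Assumption: the weights sum to 1 and each component is Gaussian.
    Returns (predictive mean, aleatoric variance, epistemic variance).
    """
    mean = sum(w * m for w, m in zip(weights, means))
    # Aleatoric: expected noise variance across components.
    aleatoric = sum(w * v for w, v in zip(weights, variances))
    # Epistemic: spread of the component means around the mixture mean.
    epistemic = sum(w * (m - mean) ** 2 for w, m in zip(weights, means))
    return mean, aleatoric, epistemic


def select_queries(pool, budget):
    """Rank unlabeled compounds by total uncertainty and keep the top `budget`.

    `pool` maps a compound id to its (weights, means, variances) tuple.
    The score (aleatoric + epistemic) is an illustrative choice only.
    """
    scores = {}
    for compound_id, params in pool.items():
        _, aleatoric, epistemic = mixture_uncertainty(*params)
        scores[compound_id] = aleatoric + epistemic
    return sorted(scores, key=scores.get, reverse=True)[:budget]


# Toy pool: three hypothetical compounds with made-up mixture parameters.
pool = {
    "cpd_a": ([0.5, 0.5], [1.0, 3.0], [0.1, 0.1]),  # components disagree
    "cpd_b": ([0.5, 0.5], [2.0, 2.0], [0.4, 0.4]),  # noisy but consistent
    "cpd_c": ([1.0], [2.0], [0.05]),                # confident prediction
}
print(select_queries(pool, budget=2))
```

In this toy pool the compound whose components disagree most is ranked first, which matches the intent of prioritizing high-uncertainty compounds for labeling.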
2. Related Work
3. Results
3.1. Experimental Settings
3.1.1. Dataset
3.1.2. Metrics
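- Root Mean Squared Error (RMSE): Quantifies the average magnitude of prediction errors in the target variable. It is defined as follows:

  $$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

  where $n$ is the total number of samples, $y_i$ denotes the true value, and $\hat{y}_i$ denotes the predicted value for the $i$-th sample.
- Coefficient of Determination ($R^2$): Measures the proportion of variance in the observed outcomes explained by the model, computed as follows:

  $$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$

  where $\bar{y}$ is the mean of the observed values.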
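
As a small worked illustration of the two metrics, the following sketch computes them with the standard textbook definitions in plain Python. The function names and the toy values are illustrative and do not come from the paper's code or data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error over paired true and predicted values."""
    n = len(y_true)
    squared_errors = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_errors / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values, made up for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(rmse(y_true, y_pred))
print(r2_score(y_true, y_pred))
```

Under this definition the coefficient of determination becomes negative whenever a model's squared error exceeds that of always predicting the mean of the observed values, which is how negative entries in a results table should be read.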
3.1.3. Training Details
3.2. Comparative Experiments
3.2.1. Multi-Task Learning Setting
3.2.2. Active Learning Setting
3.3. Effectiveness Analysis
3.3.1. Probability Modeling
3.3.2. Multi-Task Training
3.3.3. Active Annotation
3.4. Hyperparameter Analysis
3.4.1. Number of Components
3.4.2. Balanced Factor
3.5. Visualizations
3.5.1. Prediction Visualizations
3.5.2. Uncertainty Visualizations
3.5.3. Representation Visualizations
4. Method
4.1. Notations
4.2. PMAL Framework
4.3. Probabilistic Multi-Task Learning
4.4. Uncertainty-Based Active Learning
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Vo, A.H.; Van Vleet, T.R.; Gupta, R.R.; Liguori, M.J.; Rao, M.S. An overview of machine learning and big data for drug toxicity evaluation. Chem. Res. Toxicol. 2019, 33, 20–37.
2. Hua, Y.; Shi, Y.; Cui, X.; Li, X. In silico prediction of chemical-induced hematotoxicity with machine learning and deep learning methods. Mol. Divers. 2021, 25, 1585–1596.
3. Tran, T.T.V.; Surya Wibowo, A.; Tayara, H.; Chong, K.T. Artificial intelligence in drug toxicity prediction: Recent advances, challenges, and future perspectives. J. Chem. Inf. Model. 2023, 63, 2628–2643.
4. Zhang, J.; Li, H.; Zhang, Y.; Huang, J.; Ren, L.; Zhang, C.; Zou, Q.; Zhang, Y. Computational toxicology in drug discovery: Applications of artificial intelligence in ADMET and toxicity prediction. Briefings Bioinform. 2025, 26, bbaf533.
5. Pérez Santín, E.; Rodríguez Solana, R.; González García, M.; García Suárez, M.D.M.; Blanco Díaz, G.D.; Cima Cabal, M.D.; Moreno Rojas, J.M.; López Sánchez, J.I. Toxicity prediction based on artificial intelligence: A multidisciplinary overview. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2021, 11, e1516.
6. Vandenhende, S.; Georgoulis, S.; Van Gansbeke, W.; Proesmans, M.; Dai, D.; Van Gool, L. Multi-task learning for dense prediction tasks: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3614–3633.
7. Li, D.; Wang, Z.; Chen, Y.; Jiang, R.; Ding, W.; Okumura, M. A survey on deep active learning: Recent advances and new frontiers. IEEE Trans. Neural Netw. Learn. Syst. 2025, 36, 5879–5899.
8. Wu, Z.; Chen, J.; Li, Y.; Deng, Y.; Zhao, H.; Hsieh, C.Y.; Hou, T. From black boxes to actionable insights: A perspective on explainable artificial intelligence for scientific discovery. J. Chem. Inf. Model. 2023, 63, 7617–7627.
9. Gal, Y.; Ghahramani, Z. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In Proceedings of the International Conference on Machine Learning; PMLR; JMLR.org: Norfolk, MA, USA, 2016; pp. 1050–1059.
10. Stauffer, C.; Grimson, W.E.L. Adaptive background mixture models for real-time tracking. In Proceedings of the 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149); IEEE: Piscataway, NJ, USA, 1999; Volume 2, pp. 246–252.
11. Toropov, A.A.; Rasulev, B.F.; Leszczynski, J. QSAR modeling of acute toxicity for nitrobenzene derivatives towards rats: Comparative analysis by MLRA and optimal descriptors. QSAR Comb. Sci. 2007, 26, 686–693.
12. Mazzatorta, P.; Cronin, M.T.; Benfenati, E. A QSAR study of avian oral toxicity using support vector machines and genetic algorithms. QSAR Comb. Sci. 2006, 25, 616–628.
13. Polishchuk, P.G.; Muratov, E.N.; Artemenko, A.G.; Kolumbin, O.G.; Muratov, N.N.; Kuz’min, V.E. Application of random forest approach to QSAR prediction of aquatic toxicity. J. Chem. Inf. Model. 2009, 49, 2481–2488.
14. Zakharov, A.V.; Zhao, T.; Nguyen, D.T.; Peryea, T.; Sheils, T.; Yasgar, A.; Huang, R.; Southall, N.; Simeonov, A. Novel consensus architecture to improve performance of large-scale multitask deep learning QSAR models. J. Chem. Inf. Model. 2019, 59, 4613–4624.
15. Sosnin, S.; Karlov, D.; Tetko, I.V.; Fedorov, M.V. Comparative study of multitask toxicity modeling on a broad chemical space. J. Chem. Inf. Model. 2018, 59, 1062–1072.
16. Jain, S.; Siramshetty, V.B.; Alves, V.M.; Muratov, E.N.; Kleinstreuer, N.; Tropsha, A.; Nicklaus, M.C.; Simeonov, A.; Zakharov, A.V. Large-scale modeling of multispecies acute toxicity end points using consensus of multitask deep learning methods. J. Chem. Inf. Model. 2021, 61, 653–663.
17. Daghighi, A.; Casanola-Martin, G.M.; Iduoku, K.; Kusic, H.; Gonzalez-Diaz, H.; Rasulev, B. Multi-endpoint acute toxicity assessment of organic compounds using large-scale machine learning modeling. Environ. Sci. Technol. 2024, 58, 10116–10127.
18. Fu, L.; Shi, S.; Yi, J.; Wang, N.; He, Y.; Wu, Z.; Peng, J.; Deng, Y.; Wang, W.; Wu, C.; et al. ADMETlab 3.0: An updated comprehensive online ADMET prediction platform enhanced with broader coverage, improved performance, API functionality and decision support. Nucleic Acids Res. 2024, 52, W422–W431.
19. Gao, J.; Wu, L.; Lin, G.; Zou, J.; Yan, B.; Liu, K.; He, S.; Bo, X. Multi-task multi-view and iterative error-correcting random forest for acute toxicity prediction. Expert Syst. Appl. 2025, 274, 126972.
20. Wu, P.Y.; Chou, W.C.; Kamineni, V.N.; Chen, C.Y.; Hsieh, J.H.; Vulpe, C.D.; Lin, Z. Development of machine learning-based multi-task quantitative structure–activity relationship models for predicting toxicities in six human organ systems. Comput. Toxicol. 2025, 37, 100399.
21. Misra, I.; Shrivastava, A.; Gupta, A.; Hebert, M. Cross-stitch networks for multi-task learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2016; pp. 3994–4003.
22. Ma, J.; Zhao, Z.; Yi, X.; Chen, J.; Hong, L.; Chi, E.H. Modeling task relationships in multi-task learning with multi-gate mixture-of-experts. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; Association for Computing Machinery: New York, NY, USA, 2018; pp. 1930–1939.
23. Liu, S.; Johns, E.; Davison, A.J. End-to-end multi-task learning with attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2019; pp. 1871–1880.
24. Ding, C.; Lu, Z.; Wang, S.; Cheng, R.; Boddeti, V.N. Mitigating task interference in multi-task learning via explicit task routing with non-learnable primitives. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2023; pp. 7756–7765.
25. Kipf, T. Semi-supervised classification with graph convolutional networks. In Proceedings of the International Conference on Learning Representations; Curran Associates, Inc.: Red Hook, NY, USA, 2017; pp. 1–14.
26. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. In Proceedings of the International Conference on Learning Representations; Curran Associates, Inc.: Red Hook, NY, USA, 2018; pp. 1–12.
27. Xu, K.; Hu, W.; Leskovec, J.; Jegelka, S. How powerful are graph neural networks? In Proceedings of the International Conference on Learning Representations; Curran Associates, Inc.: Red Hook, NY, USA, 2019; pp. 1–17.
28. Kendall, A.; Gal, Y.; Cipolla, R. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2018; pp. 7482–7491.
29. Chen, Z.; Badrinarayanan, V.; Lee, C.Y.; Rabinovich, A. GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In Proceedings of the International Conference on Machine Learning; PMLR; JMLR.org: Norfolk, MA, USA, 2018; pp. 794–803.
30. He, Y.; Zhou, S.; Zhang, G.; Yun, H.; Xu, Y.; Zeng, B.; Chilimbi, T.; Zhao, H. Robust multi-task learning with excess risks. In Proceedings of the International Conference on Machine Learning; PMLR; JMLR.org: Norfolk, MA, USA, 2024; pp. 1–21.
31. Yu, T.; Kumar, S.; Gupta, A.; Levine, S.; Hausman, K.; Finn, C. Gradient surgery for multi-task learning. Adv. Neural Inf. Process. Syst. 2020, 33, 5824–5836.
32. Liu, B.; Liu, X.; Jin, X.; Stone, P.; Liu, Q. Conflict-averse gradient descent for multi-task learning. Adv. Neural Inf. Process. Syst. 2021, 34, 18878–18890.
33. Wang, Z.; Tsvetkov, Y.; Firat, O.; Cao, Y. Gradient vaccine: Investigating and improving multi-task optimization in massively multilingual models. In Proceedings of the International Conference on Learning Representations; Openreview.net: Newton Highlands, MA, USA, 2021; pp. 1–22.
34. Wang, D.; Shang, Y. A new active labeling method for deep learning. In Proceedings of the International Joint Conference on Neural Networks; IEEE: Piscataway, NJ, USA, 2014; pp. 112–119.
35. Gal, Y.; Islam, R.; Ghahramani, Z. Deep Bayesian active learning with image data. In Proceedings of the International Conference on Machine Learning; PMLR; JMLR.org: Norfolk, MA, USA, 2017; pp. 1–10.
36. Kirsch, A.; Van Amersfoort, J.; Gal, Y. BatchBALD: Efficient and diverse batch acquisition for deep Bayesian active learning. Adv. Neural Inf. Process. Syst. 2019, 32, 7026–7037.
37. Yoo, D.; Kweon, I.S. Learning loss for active learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2019; pp. 93–102.
38. Kye, S.M.; Choi, K.; Byun, H.; Chang, B. TiDAL: Learning training dynamics for active learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision; IEEE: Piscataway, NJ, USA, 2023; pp. 22335–22345.
39. Sener, O.; Savarese, S. Active learning for convolutional neural networks: A core-set approach. In Proceedings of the International Conference on Learning Representations; Curran Associates, Inc.: Red Hook, NY, USA, 2018; pp. 1–13.
40. Yehuda, O.; Dekel, A.; Hacohen, G.; Weinshall, D. Active learning through a covering lens. Adv. Neural Inf. Process. Syst. 2022, 35, 22354–22367.
41. Wu, L.; Yan, B.; Han, J.; Li, R.; Xiao, J.; He, S.; Bo, X. TOXRIC: A comprehensive database of toxicological data and benchmarks. Nucleic Acids Res. 2023, 51, D1432–D1445.
42. Parzen, E. On estimation of a probability density function and mode. Ann. Math. Stat. 1962, 33, 1065–1076.
43. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
44. Choi, J.; Elezi, I.; Lee, H.J.; Farabet, C.; Alvarez, J.M. Active learning for deep object detection via probabilistic modeling. In Proceedings of the IEEE/CVF International Conference on Computer Vision; IEEE: Piscataway, NJ, USA, 2021; pp. 10264–10273.






RMSE by species for the multi-task baselines and PMAL (lower is better).

| Species | Labels | MTGCN [25] | MTGAT [26] | MTGIN [27] | PMAL |
|---|---|---|---|---|---|
| Mouse | 86,231 | 0.7795 ± 0.0204 | 0.7818 ± 0.0218 | 0.7825 ± 0.0103 | 0.7924 ± 0.0131 |
| Rat | 21,749 | 0.7255 ± 0.0089 | 0.7359 ± 0.0196 | 0.7356 ± 0.0157 | 0.7559 ± 0.0088 |
| Mammal | 2453 | 0.7132 ± 0.0094 | 0.7114 ± 0.0273 | 0.6987 ± 0.0079 | 0.7062 ± 0.0412 |
| Guinea Pig | 1743 | 0.9406 ± 0.0210 | 0.9293 ± 0.0117 | 0.9072 ± 0.0118 | 0.8091 ± 0.0385 |
| Rabbit | 4283 | 0.7138 ± 0.0109 | 0.7352 ± 0.0209 | 0.7187 ± 0.0169 | 0.7807 ± 0.0092 |
| Dog | 1534 | 0.7698 ± 0.0361 | 0.7639 ± 0.0241 | 0.7910 ± 0.0253 | 0.8361 ± 0.0165 |
| Cat | 693 | 1.1136 ± 0.0349 | 1.1623 ± 0.0037 | 1.0830 ± 0.0584 | 0.8833 ± 0.0828 |
| Bird | 329 | 0.6364 ± 0.0342 | 0.6654 ± 0.0307 | 0.6689 ± 0.0490 | 0.7325 ± 0.0539 |
| Quail | 349 | 0.7651 ± 0.0730 | 0.7605 ± 0.0157 | 0.7428 ± 0.0490 | 0.7026 ± 0.0635 |
| Duck | 186 | 1.1770 ± 0.0295 | 1.1528 ± 0.0462 | 1.1830 ± 0.1293 | 1.1836 ± 0.1012 |
| Chicken | 347 | 0.8196 ± 0.0594 | 0.9200 ± 0.0151 | 0.8702 ± 0.0574 | 0.7940 ± 0.0280 |
| Frog | 106 | 1.2954 ± 0.1203 | 1.2520 ± 0.0544 | 1.3084 ± 0.0585 | 1.2309 ± 0.0977 |
| Human | 381 | 1.0756 ± 0.0909 | 1.0534 ± 0.0590 | 1.1105 ± 0.0325 | 1.1618 ± 0.0224 |
| Mean | - | 0.8865 ± 0.0422 | 0.8941 ± 0.0269 | 0.8924 ± 0.0402 | 0.8745 ± 0.0444 |

R² by species for the multi-task baselines and PMAL (higher is better).

| Species | Labels | MTGCN [25] | MTGAT [26] | MTGIN [27] | PMAL |
|---|---|---|---|---|---|
| Mouse | 86,231 | 0.3059 ± 0.0368 | 0.2825 ± 0.0369 | 0.2913 ± 0.0165 | 0.2828 ± 0.0179 |
| Rat | 21,749 | 0.3879 ± 0.0182 | 0.3824 ± 0.0316 | 0.3787 ± 0.0251 | 0.3506 ± 0.0150 |
| Mammal | 2453 | 0.1885 ± 0.0310 | 0.1961 ± 0.0677 | 0.2265 ± 0.0226 | 0.2328 ± 0.0916 |
| Guinea Pig | 1743 | −0.1206 ± 0.0767 | −0.0297 ± 0.0565 | 0.0436 ± 0.0472 | 0.3235 ± 0.0739 |
| Rabbit | 4283 | 0.3645 ± 0.0451 | 0.3325 ± 0.0276 | 0.3774 ± 0.0031 | 0.2685 ± 0.0480 |
| Dog | 1534 | 0.4616 ± 0.0567 | 0.4658 ± 0.0333 | 0.4330 ± 0.0323 | 0.3520 ± 0.0191 |
| Cat | 693 | −0.0676 ± 0.0717 | −0.1395 ± 0.0180 | −0.0031 ± 0.1387 | 0.3860 ± 0.1081 |
| Bird | 329 | 0.1811 ± 0.0891 | 0.1052 ± 0.0815 | 0.0939 ± 0.1336 | −0.0828 ± 0.1698 |
| Quail | 349 | 0.3744 ± 0.1157 | 0.3856 ± 0.0252 | 0.4122 ± 0.0760 | 0.4757 ± 0.1033 |
| Duck | 186 | 0.3352 ± 0.0335 | 0.3618 ± 0.0516 | 0.3233 ± 0.1425 | 0.3279 ± 0.1243 |
| Chicken | 347 | 0.2817 ± 0.1047 | 0.0979 ± 0.0297 | 0.1906 ± 0.1046 | 0.3275 ± 0.0477 |
| Frog | 106 | 0.2234 ± 0.1471 | 0.2779 ± 0.0634 | 0.2112 ± 0.0712 | 0.3028 ± 0.1191 |
| Human | 381 | −0.1243 ± 0.1911 | −0.0813 ± 0.0960 | −0.2018 ± 0.0870 | −0.1384 ± 0.0196 |
| Mean | - | 0.2148 ± 0.0783 | 0.2029 ± 0.0476 | 0.2136 ± 0.0693 | 0.2622 ± 0.0736 |

RMSE per active learning round for each query strategy (lower is better).

| Method | Round 1 | Round 2 | Round 3 | Round 4 | p Value |
|---|---|---|---|---|---|
| Random | 0.9465 ± 0.0169 | 0.9065 ± 0.0139 | 0.8751 ± 0.0078 | 0.8637 ± 0.0087 | - |
| Entropy [34] | 0.9465 ± 0.0169 | 0.9180 ± 0.0137 | 0.8944 ± 0.0146 | 0.8917 ± 0.0156 | 0.0899 |
| CoreSet [39] | 0.9465 ± 0.0169 | 0.8687 ± 0.0077 | 0.8418 ± 0.0089 | 0.8418 ± 0.0104 | 0.0705 |
| ProbCover [40] | 0.9465 ± 0.0169 | 0.8585 ± 0.0239 | 0.8546 ± 0.0200 | 0.8504 ± 0.0176 | 0.1365 |
| TiDAL [38] | 0.9465 ± 0.0169 | 0.8673 ± 0.0154 | 0.8656 ± 0.0167 | 0.8647 ± 0.0166 | 0.2938 |
| PMAL | 0.9465 ± 0.0169 | 0.8741 ± 0.0082 | 0.8439 ± 0.0083 | 0.8277 ± 0.0096 | 0.0587 |

R² per active learning round for each query strategy (higher is better).

| Method | Round 1 | Round 2 | Round 3 | Round 4 | p Value |
|---|---|---|---|---|---|
| Random | 0.0550 ± 0.0345 | 0.1226 ± 0.0323 | 0.1788 ± 0.0100 | 0.1968 ± 0.0121 | - |
| Entropy [34] | 0.0550 ± 0.0345 | 0.1188 ± 0.0215 | 0.1666 ± 0.0209 | 0.1699 ± 0.0222 | 0.1700 |
| CoreSet [39] | 0.0550 ± 0.0345 | 0.1702 ± 0.0321 | 0.2258 ± 0.0224 | 0.2323 ± 0.0212 | 0.0622 |
| ProbCover [40] | 0.0550 ± 0.0345 | 0.2158 ± 0.0371 | 0.2230 ± 0.0308 | 0.2326 ± 0.0239 | 0.1094 |
| TiDAL [38] | 0.0550 ± 0.0345 | 0.2087 ± 0.0277 | 0.2131 ± 0.0297 | 0.2131 ± 0.0297 | 0.1646 |
| PMAL | 0.0550 ± 0.0345 | 0.1759 ± 0.0200 | 0.2263 ± 0.0166 | 0.2514 ± 0.0166 | 0.0586 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Han, T.; Wang, J.; Zhao, Y.; Lin, Y.; Yu, L.; He, S.; Zan, P.; Bo, X. A Multitask Active Learning Framework with Probabilistic Modeling for Multi-Species Acute Toxicity Prediction. Molecules 2026, 31, 1144. https://doi.org/10.3390/molecules31071144

