PathCare: Integrating Clinical Pathway Information to Enable Healthcare Prediction at the Neuron Level
Abstract
1. Introduction
- Methodologically, we propose PathCare, a framework that learns future-predictive health representations and models long-term clinical pathways even when follow-up visits are missing. A longitudinal-aware auxiliary sub-network predicts future clinical visits and encodes health status from a temporal perspective. A neuron-level filtering mechanism adaptively selects target-predictive features, encouraging diversity among hidden units while preserving critical information. Together, these components refine longitudinal modeling and feature selection and explicitly model disrupted health trajectories, yielding more robust patient representations across irregular visit patterns.
- Experimentally, evaluations on CDSL, MIMIC-III, and MIMIC-IV datasets demonstrate PathCare’s superior performance in mortality and readmission prediction tasks, achieving relative AUPRC improvements of up to 2.43% and consistently higher min(+P, Se) values across all datasets. Our model maintains robust performance even under extreme data sparsity conditions and shows particular effectiveness for patients with regular missing visits. Ablation studies confirm that both the pathway prediction component and adaptive filtering mechanism contribute significantly to performance gains, validating our approach to clinical trajectory modeling in real-world settings with varying data completeness.
2. Related Work
2.1. Direct Data Space Methods
2.2. Indirect Representation Space Methods
3. Preliminary
3.1. A Motivating Example
3.2. Problem Formulation
4. Method
- We explicitly model the clinical status pathway by training a Gated Recurrent Unit (GRU)-based auxiliary sub-network (i.e., the left GRU) to predict the lab tests and clinical events recorded in a future visit. The hidden representation of the sequence is thereby encoded to be a good predictor of future status, and it is supplied as additional clinical features to the supervised clinical prediction task. This helps the model depict health status from a long-term perspective.
- A task-specific GRU extracts the other part of the health status representation. The model merges the task-specific representation and the auxiliary representation to perform the target prediction. We encourage diversity among hidden units through layer decorrelation so that the useful units stand out (denoted as red circles). A neuron-level gate filters out the units that are useless for the target prediction (denoted as blue squares on the left side and blue triangles on the right side), which reduces the redundancy of the model.
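The two-branch design described above can be sketched in PyTorch. This is a minimal illustration under stated assumptions, not the released implementation: the class name `PathwayGateSketch`, the layer sizes, the mean-squared-error next-visit loss, and the plain sigmoid gate are all hypothetical choices made here, and the decorrelation term is left out. The exact gate formulation used by PathCare is not reproduced in this text.

```python
import torch
import torch.nn as nn


class PathwayGateSketch(nn.Module):
    """Illustrative two-branch model; names and sizes are assumptions."""

    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        # Auxiliary branch: trained to predict the next visit's record.
        self.future_gru = nn.GRU(n_features, hidden, batch_first=True)
        self.next_visit_head = nn.Linear(hidden, n_features)
        # Task-specific branch: trained on the primary target.
        self.task_gru = nn.GRU(n_features, hidden, batch_first=True)
        # Projections and a per-unit gate (plain sigmoid, an assumption).
        self.proj_future = nn.Linear(hidden, hidden)
        self.proj_task = nn.Linear(hidden, hidden)
        self.gate = nn.Linear(2 * hidden, hidden)
        self.target_head = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor):
        # x has shape (batch, visits, n_features).
        h_future, _ = self.future_gru(x)
        h_task, _ = self.task_gru(x)
        next_visit_pred = self.next_visit_head(h_future)
        f = torch.tanh(self.proj_future(h_future))
        s = torch.tanh(self.proj_task(h_task))
        # One gate value per hidden unit decides how much of each branch passes.
        g = torch.sigmoid(self.gate(torch.cat([f, s], dim=-1)))
        combined = g * f + (1.0 - g) * s
        y_hat = torch.sigmoid(self.target_head(combined)).squeeze(-1)
        return y_hat, next_visit_pred, g


def joint_loss(model: PathwayGateSketch, x: torch.Tensor, y: torch.Tensor):
    """Primary loss plus next-visit loss; y has shape (batch, visits)."""
    y_hat, next_pred, _ = model(x)
    # The state at visit t predicts the record at visit t + 1, so the last
    # prediction and the first record have no counterpart and are dropped.
    aux = nn.functional.mse_loss(next_pred[:, :-1], x[:, 1:])
    main = nn.functional.binary_cross_entropy(y_hat, y)
    return main + aux
```

Calling `joint_loss(model, x, y).backward()` trains both branches together, so the auxiliary GRU is shaped by the next-visit target while the gate learns which units of each branch to keep for the primary prediction.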
4.1. Auxiliary Task for Clinical Pathway Modeling
4.2. Neuron-Level Filtering Gate
5. Experimental Setups
5.1. Datasets
5.2. Evaluation Metrics
5.3. Prediction Tasks
5.4. Baseline Models
5.4.1. EHR-Specific Models
- RETAIN [2] utilizes a two-level neural attention mechanism to detect influential visits and significant clinical variables within those visits. It processes EHR data in reverse time order, mimicking physician practice by giving higher attention to recent clinical visits.
- SAFARI [6] learns compact patient health representations by imposing a correlational sparsity prior on the correlations of medical feature pairs. It solves a bi-level optimization problem involving high-level inter-group correlations and lower-level intra-group correlations, using a Laplacian kernel and graph neural networks.
- AdaCare [16] captures the long- and short-term variations of biomarkers as clinical features to represent health status across multiple time scales. It models correlations between clinical features to enhance those that strongly indicate health status, maintaining high prediction accuracy while providing interpretability.
- GRASP [4] enhances patient representation learning by leveraging knowledge from similar patients. It defines similarities between patients for different clinical tasks, finds similar patients with useful information, and learns cohort representation to extract valuable knowledge.
5.4.2. General Deep Learning Models
- RNN is a standard recurrent neural network model applied to sequential medical data, serving as a fundamental baseline.
- is a basic GRU model with an addition-based attention mechanism, serving as a strong baseline for healthcare prediction tasks.
- Mamba [9] is a linear-time sequence modeling architecture based on selective state spaces. It allows the model to selectively propagate or forget information along the sequence length dimension, depending on the current token, making it suitable for processing long sequences of medical data.
5.4.3. Ablation Models
- removes the future clinical pathway context module, focusing solely on current information for prediction.
- directly concatenates auxiliary and task-specific embeddings without the neuron-level filtering gate, maintaining potential redundancy between embeddings.
5.5. Implementation Details
6. Experimental Results and Analysis
6.1. Experimental Results
6.2. Ablation Study
6.3. Observations and Analysis
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Online Resources
- We have released the source code at https://github.com/fuxius/PathCare (accessed on 3 April 2025).
- The MIMIC-III dataset is provided at https://physionet.org/content/mimiciii/1.4/ (accessed on 4 September 2016).
- The MIMIC-IV dataset is provided at https://physionet.org/content/mimiciv/2.2/ (accessed on 6 January 2023).
- The HM Hospitals COVID-19 Collaborator is provided at https://www.hmhospitales.com/prensa/notas-de-prensa/comunicado-covid-data-save-lives (accessed on 4 June 2024).
References
1. Chen, J.; Zhang, A. Hgmf: Heterogeneous graph-based fusion for multimodal data with incompleteness. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, 6–10 July 2020; pp. 1295–1305.
2. Choi, E.; Bahadori, M.T.; Sun, J.; Kulas, J.; Schuetz, A.; Stewart, W. Retain: An interpretable predictive model for healthcare using reverse time attention mechanism. In Proceedings of the Advances in Neural Information Processing Systems 29 (NIPS 2016), Barcelona, Spain, 5–10 December 2016; Volume 29.
3. Ma, L.; Zhang, C.; Gao, J.; Jiao, X.; Yu, Z.; Zhu, Y.; Wang, T.; Ma, X.; Wang, Y.; Tang, W.; et al. Mortality prediction with adaptive feature importance recalibration for peritoneal dialysis patients. Patterns 2023, 4, 100892.
4. Zhang, C.; Gao, X.; Ma, L.; Wang, Y.; Wang, J.; Tang, W. GRASP: Generic framework for health status representation learning based on incorporating knowledge from similar patients. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 715–723.
5. Harutyunyan, H.; Khachatrian, H.; Kale, D.C.; Galstyan, A. Multitask learning and benchmarking with clinical time series data. arXiv 2017, arXiv:1703.07771.
6. Ma, X.; Wang, Y.; Chu, X.; Ma, L.; Tang, W.; Zhao, J.; Yuan, Y.; Wang, G. Patient health representation learning via correlational sparse prior of medical features. IEEE Trans. Knowl. Data Eng. 2022, 35, 11769–11783.
7. Zhang, Z.; Zhang, Q.; Jiao, Y.; Lu, L.; Ma, L.; Liu, A.; Liu, X.; Zhao, J.; Xue, Y.; Wei, B.; et al. Methodology and real-world applications of dynamic uncertain causality graph for clinical diagnosis with explainability and invariance. Artif. Intell. Rev. 2024, 57, 151.
8. Chowdhury, R.R.; Li, J.; Zhang, X.; Hong, D.; Gupta, R.K.; Shang, J. Primenet: Pre-training for irregular multivariate time series. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 7184–7192.
9. Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. arXiv 2023, arXiv:2312.00752.
10. Zhang, C.; Chu, X.; Ma, L.; Zhu, Y.; Wang, Y.; Wang, J.; Zhao, J. M3care: Learning with missing modalities in multimodal healthcare data. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 2418–2428.
11. Johnson, A.E.; Pollard, T.J.; Shen, L.; Lehman, L.H.; Feng, M.; Ghassemi, M.; Moody, B.; Szolovits, P.; Celi, L.A.; Mark, R.G. MIMIC-III, a freely accessible critical care database. Sci. Data 2016, 3, 160035.
12. Johnson, A.E.; Bulgarelli, L.; Shen, L.; Gayles, A.; Shammout, A.; Horng, S.; Pollard, T.J.; Hao, S.; Moody, B.; Gow, B.; et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci. Data 2023, 10, 1.
13. Van Buuren, S.; Groothuis-Oudshoorn, K. mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 2011, 45, 1–67.
14. Choi, E.; Xiao, C.; Stewart, W.; Sun, J. Mime: Multilevel medical embedding of electronic health records for predictive healthcare. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December 2018; pp. 4547–4557.
15. Ma, L.; Zhang, C.; Wang, Y.; Ruan, W.; Wang, J.; Tang, W.; Ma, X.; Gao, X.; Gao, J. Concare: Personalized clinical feature embedding via capturing the healthcare context. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 833–840.
16. Ma, L.; Gao, J.; Wang, Y.; Zhang, C.; Wang, J.; Ruan, W.; Tang, W.; Gao, X.; Ma, X. Adacare: Explainable clinical health status representation learning via scale-adaptive feature extraction and recalibration. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 825–832.
17. Che, Z.; Purushotham, S.; Cho, K.; Sontag, D.; Liu, Y. Recurrent neural networks for multivariate time series with missing values. Sci. Rep. 2018, 8, 6085.
18. Peters, M.E.; Ammar, W.; Bhagavatula, C.; Power, R. Semi-supervised sequence tagging with bidirectional language models. arXiv 2017, arXiv:1705.00108.
19. Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555.
20. Cogswell, M.; Ahmed, F.; Girshick, R.; Zitnick, L.; Batra, D. Reducing overfitting in deep networks by decorrelating representations. arXiv 2015, arXiv:1511.06068.
21. Shen, Y.; Tan, S.; Sordoni, A.; Courville, A. Ordered neurons: Integrating tree structures into recurrent neural networks. arXiv 2018, arXiv:1810.09536.
22. HM Hospitales. Covid Data Save Lives. 2020. Available online: https://www.hmhospitales.com/prensa/notas-de-prensa/comunicado-covid-data-save-lives (accessed on 5 June 2024).
23. Keilwagen, J.; Grosse, I.; Grau, J. Area under precision-recall curves for weighted and unweighted data. PLoS ONE 2014, 9, e92209.
24. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An imperative style, high-performance deep learning library. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32.
25. Falcon, W.A. PyTorch Lightning; GitHub: San Francisco, CA, USA, 2019.
26. Gao, J.; Zhu, Y.; Wang, W.; Wang, Z.; Dong, G.; Tang, W.; Wang, H.; Wang, Y.; Harrison, E.M.; Ma, L. A comprehensive benchmark for COVID-19 predictive modeling using electronic health records in intensive care. Patterns 2024, 5, 100951.
27. Zhu, Y.; Wang, W.; Gao, J.; Ma, L. PyEHR: A Predictive Modeling Toolkit for Electronic Health Records. 2023. Available online: https://github.com/yhzhu99/pyehr (accessed on 22 November 2024).
28. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
| Notation | Definition |
|---|---|
| | Ground truth of prediction target at t-th visit |
| | Prediction result at t-th visit |
| | Multivariate visit record at t-th visit |
| | Prediction result of the status of the next visit |
| | Health status embedding learned to predict the future visit |
| | Health status embedding learned to predict the primary target |
| | Projection of future-specific embedding |
| | Projection of task-specific embedding |
| | Neuron-level gate for modeling the demand degree of a future visit |
| | Mask for selecting units from |
| | Mask for selecting units from |
| | Combined health status representation for the target prediction |
| B | Batch size |
| | Covariances between all pairs of activations i and j in the layer |
| | The i-th activation of at the b-th case in the batch |
| | The sample mean of activation i over the batch |
| | Projection matrix for future representations |
| | Projection matrix for task-specific representations |
| | Projection matrix for the filtering gate |
| | Hyperparameter for the decorrelation loss term |
| | Loss for predicting the next visit |
| | Decorrelation loss term |
| | Loss for the primary prediction task |
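The covariance-related entries in the notation table (pairwise covariances of activations, per-unit batch means, batch size B) match the ingredients of the DeCov penalty of Cogswell et al., which appears in the reference list. Below is a minimal NumPy sketch of that penalty, assuming its standard form: half the sum of squared off-diagonal entries of the batch covariance matrix. Whether PathCare uses exactly this form is not shown in this text, and the function name `decorrelation_loss` is illustrative.

```python
import numpy as np


def decorrelation_loss(h: np.ndarray) -> float:
    """DeCov-style penalty on a (B, d) batch of hidden activations."""
    batch_size = h.shape[0]
    # Subtract the sample mean of each activation over the batch.
    centered = h - h.mean(axis=0, keepdims=True)
    # Covariance between every pair of activations i and j, shape (d, d).
    cov = centered.T @ centered / batch_size
    # Remove the diagonal so that only cross-unit covariances are penalized.
    off_diag = cov - np.diag(np.diag(cov))
    return 0.5 * float(np.sum(off_diag ** 2))
```

The penalty is zero when hidden units are pairwise uncorrelated over the batch and grows as units duplicate one another, which is the sense in which it encourages diversity among hidden units.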
Dataset | Split | # Samples | # Positive (mortality) | # Positive (readmission) |
---|---|---|---|---|
CDSL | Train | 2978 (69.99%) | 378 (12.69%) | - |
CDSL | Val | 426 (10.01%) | 54 (12.68%) | - |
CDSL | Test | 851 (20.00%) | 108 (12.69%) | - |
MIMIC-III | Train | 16,094 (80.00%) | 4996 (31.04%) | 4996 (31.04%) |
MIMIC-III | Val | 3018 (15.00%) | 934 (30.95%) | 934 (30.95%) |
MIMIC-III | Test | 1006 (5.00%) | 312 (31.01%) | 312 (31.01%) |
MIMIC-IV | Train | 17,227 (70.00%) | 5095 (29.58%) | 5095 (29.58%) |
MIMIC-IV | Val | 2461 (10.00%) | 724 (29.42%) | 724 (29.42%) |
MIMIC-IV | Test | 4922 (20.00%) | 1450 (29.46%) | 1450 (29.46%) |
Statistic | Total | Survived | Deceased |
---|---|---|---|
Number of patients | 4255 | 3715 (87.31%) | 540 (12.69%) |
Number of records | 123,044 | 108,142 (87.89%) | 14,902 (12.11%) |
Records per patient | 24.0 [15, 39] | 25.0 [15, 39] | 22.5 [11, 37] |
Age | 67.2 [56.0, 80.0] | 65.1 [54.0, 77.0] | 81.6 [75.0, 89.0] |
Age > Mean (67 years) | 2228 (52.36%) | 1748 (47.05%) | 480 (88.89%) |
Age ≤ Mean (67 years) | 2027 (47.64%) | 1967 (52.95%) | 60 (11.11%) |
Male | 2515 (59.11%) | 2173 (58.49%) | 342 (63.33%) |
Female | 1740 (40.89%) | 1542 (41.51%) | 198 (36.67%) |
Number of features | 99 | | |
Length of stay (days) | 6.4 [4.0, 11.0] | 6.1 [4.0, 11.0] | 6.0 [3.0, 10.0] |
| Methods | CDSL Mortality AUPRC (↑) | CDSL Mortality AUROC (↑) | CDSL Mortality min(+P, Se) (↑) | MIMIC-III Mortality AUPRC (↑) | MIMIC-III Mortality AUROC (↑) | MIMIC-III Mortality min(+P, Se) (↑) | MIMIC-IV Mortality AUPRC (↑) | MIMIC-IV Mortality AUROC (↑) | MIMIC-IV Mortality min(+P, Se) (↑) |
|---|---|---|---|---|---|---|---|---|---|
| RETAIN | 77.23 ± 4.13 | 93.67 ± 1.56 | 68.96 ± 4.14 | 45.60 ± 4.48 | 83.89 ± 2.12 | 28.74 ± 4.27 | 51.53 ± 1.40 | 86.61 ± 0.55 | 33.09 ± 1.77 |
| RNN | 83.03 ± 2.97 | 95.55 ± 0.82 | 69.02 ± 5.06 | 47.63 ± 5.86 | 84.03 ± 1.93 | 28.73 ± 4.97 | 51.92 ± 1.67 | 85.09 ± 0.71 | 30.92 ± 1.42 |
| SAFARI | 76.70 ± 4.02 | 94.42 ± 1.27 | 63.96 ± 3.99 | 48.32 ± 3.08 | 84.57 ± 0.94 | 25.31 ± 1.94 | 49.25 ± 1.88 | 85.21 ± 0.83 | 32.22 ± 1.73 |
| AdaCare | 82.10 ± 4.02 | 94.78 ± 1.19 | 72.57 ± 3.60 | 51.19 ± 2.90 | 83.72 ± 0.86 | 25.76 ± 2.44 | 51.18 ± 1.53 | 83.79 ± 0.68 | 34.33 ± 1.60 |
| GRASP | 83.60 ± 3.10 | 95.05 ± 0.96 | 71.12 ± 4.50 | 45.64 ± 6.29 | 83.24 ± 1.90 | 30.22 ± 5.81 | 52.63 ± 1.38 | 86.23 ± 0.61 | 30.69 ± 1.33 |
| | 80.57 ± 3.83 | 95.31 ± 1.13 | 66.38 ± 3.51 | 52.24 ± 2.62 | 85.49 ± 0.72 | 24.69 ± 2.48 | 54.61 ± 1.37 | 86.16 ± 0.73 | 42.92 ± 1.68 |
| Mamba | 79.21 ± 3.73 | 92.28 ± 2.06 | 66.69 ± 4.73 | 51.33 ± 3.12 | 85.33 ± 0.89 | 26.05 ± 2.08 | 51.66 ± 1.32 | 84.29 ± 0.72 | 32.35 ± 1.95 |
| | 82.56 ± 3.03 | 95.74 ± 0.93 | 72.48 ± 3.48 | 47.50 ± 4.83 | 83.99 ± 1.62 | 46.40 ± 0.42 | 51.67 ± 4.54 | 85.19 ± 1.70 | 52.47 ± 3.67 |
| | 82.11 ± 3.73 | 95.24 ± 1.13 | 74.72 ± 3.69 | 50.14 ± 4.68 | 85.36 ± 1.65 | 50.80 ± 0.38 | 51.86 ± 4.09 | 84.13 ± 1.73 | 51.76 ± 3.39 |
| PathCare | 84.11 ± 3.18 | 96.08 ± 0.97 | 76.55 ± 3.68 | 53.51 ± 4.40 | 85.63 ± 1.62 | 52.62 ± 0.16 | 54.19 ± 1.86 | 85.91 ± 0.65 | 52.62 ± 1.56 |
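The relative AUPRC improvement of up to 2.43% quoted in the Introduction can be checked against this table. One reading consistent with the reported numbers, offered as an assumption rather than the authors' stated calculation, compares PathCare with the highest AUPRC among the other rows of the MIMIC-III mortality column (52.24):

```latex
\[
\frac{\mathrm{AUPRC}_{\text{PathCare}} - \mathrm{AUPRC}_{\text{best other}}}
     {\mathrm{AUPRC}_{\text{best other}}}
= \frac{53.51 - 52.24}{52.24}
\approx 0.0243 = 2.43\%
\]
```

The same ratio for CDSL mortality, (84.11 - 83.60) / 83.60, is roughly 0.61%, so the MIMIC-III mortality task is where the largest gain in this table occurs.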
| Methods | MIMIC-III Readmission AUPRC (↑) | MIMIC-III Readmission AUROC (↑) | MIMIC-III Readmission min(+P, Se) (↑) | MIMIC-IV Readmission AUPRC (↑) | MIMIC-IV Readmission AUROC (↑) | MIMIC-IV Readmission min(+P, Se) (↑) |
|---|---|---|---|---|---|---|
| RETAIN | 48.98 ± 1.94 | 77.50 ± 1.14 | 25.02 ± 1.40 | 46.71 ± 1.77 | 77.53 ± 0.97 | 35.15 ± 1.76 |
| RNN | 45.77 ± 2.13 | 74.34 ± 0.87 | 28.68 ± 1.89 | 48.72 ± 1.35 | 76.05 ± 0.81 | 27.12 ± 1.53 |
| SAFARI | 46.65 ± 2.47 | 77.11 ± 1.30 | 25.25 ± 1.84 | 45.49 ± 1.82 | 76.70 ± 0.95 | 30.69 ± 1.64 |
| AdaCare | 47.19 ± 2.40 | 76.97 ± 1.10 | 24.36 ± 1.70 | 46.87 ± 1.28 | 76.07 ± 0.84 | 26.40 ± 1.63 |
| GRASP | 48.36 ± 2.09 | 76.70 ± 0.93 | 18.29 ± 1.50 | 50.23 ± 1.50 | 78.47 ± 0.88 | 29.19 ± 1.37 |
| | 50.24 ± 2.08 | 78.36 ± 1.16 | 25.12 ± 1.43 | 50.97 ± 1.31 | 78.46 ± 0.86 | 33.80 ± 1.77 |
| Mamba | 45.98 ± 2.20 | 76.38 ± 1.06 | 24.50 ± 1.52 | 48.04 ± 1.42 | 76.87 ± 0.86 | 27.45 ± 1.63 |
| | 47.13 ± 2.05 | 76.76 ± 0.92 | 47.17 ± 1.72 | 46.54 ± 1.61 | 76.98 ± 0.85 | 47.39 ± 1.41 |
| | 47.06 ± 2.11 | 75.43 ± 0.99 | 46.42 ± 1.77 | 50.42 ± 1.61 | 76.85 ± 0.88 | 48.61 ± 1.50 |
| PathCare | 51.01 ± 1.90 | 78.64 ± 0.88 | 50.44 ± 1.72 | 51.52 ± 1.61 | 78.41 ± 0.92 | 50.30 ± 1.40 |
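The min(+P, Se) columns report the minimum of precision (+P) and sensitivity (Se). A common way to compute it, used in clinical time-series benchmarks such as Harutyunyan et al., takes the best value of min(precision, sensitivity) over all decision thresholds. The sketch below assumes that definition; the function name `min_precision_sensitivity` is illustrative, and it returns a fraction in [0, 1], whereas the table values appear to be percentages.

```python
import numpy as np


def min_precision_sensitivity(y_true, y_score) -> float:
    """Best min(precision, sensitivity) over all observed score thresholds."""
    y_true = np.asarray(y_true, dtype=int)
    y_score = np.asarray(y_score, dtype=float)
    n_pos = int(y_true.sum())
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    best = 0.0
    for thr in np.unique(y_score):
        pred = y_score >= thr
        tp = int(np.sum(pred & (y_true == 1)))
        # pred.sum() is at least 1 because thr is itself an observed score.
        precision = tp / int(pred.sum())
        sensitivity = tp / n_pos
        best = max(best, min(precision, sensitivity))
    return best
```

Because the metric rewards thresholds where precision and sensitivity are both high, it is stricter than either quantity alone on imbalanced outcomes such as mortality.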
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sui, D.; Gu, L.; Zhang, C.; Yang, K.; Li, X.; Ma, L.; Wang, L.; Tang, W. PathCare: Integrating Clinical Pathway Information to Enable Healthcare Prediction at the Neuron Level. Bioengineering 2025, 12, 578. https://doi.org/10.3390/bioengineering12060578