A Multimodal Framework for Prognostic Modelling of Mental Health Treatment and Recovery Trajectories
Abstract
1. Introduction
1.1. The Prognostic Imperative in Psychiatry
1.2. A Recap of the Prognostic Theory
- The Patient State Vector (PSV): A comprehensive, multi-modal, and time-varying representation of a patient’s state, analogous to the initial state assessment in engineering. It integrates clinical, biological, and high-frequency digital phenotype data to create a high-dimensional characterisation of the individual.
- The Therapeutic Impulse Function (TIF): A formal characterisation of a treatment’s properties (e.g., pharmacodynamics, therapeutic modality), analogous to the “future loading conditions” in an engineering system. It defines the specific perturbation being applied to the patient’s system.
- The Predicted Recovery Trajectory (PRT): The forecasted, continuous path of a patient’s symptom severity over time, analogous to the “remaining useful life” prediction. The theory’s central thesis is that the PRT is an emergent property of the dynamic interaction between an individual’s unique PSV and a specific TIF.
1.3. Objective: From Theory to a Testable Methodology
1.4. Scope and Contributions
- A Novel Taxonomy for Prognosis: The operationalisation of the “Patient State Vector” (PSV) and “Therapeutic Impulse Function” (TIF) provides a formal vocabulary for modelling the dynamic interaction between patient and treatment, distinct from static baseline predictors.
- A Time-Aware Architecture Specification: We specify a deep learning architecture explicitly designed for the irregularity of real-world clinical data, addressing the “asynchrony” problem that renders standard RNNs ineffective in practice.
- A Translational Roadmap: Unlike purely technical papers, we provide an integrated implementation strategy that treats Explainable AI (XAI) and ethical governance as architectural requirements, not optional add-ons.
2. From Theory to Data: Operationalising the Prognostic Constructs
2.1. The Patient State Vector (PSV): A Multi-Modal, High-Frequency Data Architecture
2.1.1. Clinical Data
- Data Sources: Electronic Health Records (EHRs), clinical interviews, and Patient-Reported Outcome Measures (PROMs).
- Metrics:
  - Symptom Severity: Standardised scales such as the Patient Health Questionnaire-9 (PHQ-9) and the Hamilton Depression Rating Scale (HAM-D), collected at regular intervals (e.g., weekly) to track the evolution of symptoms.
  - Diagnostic History: Coded diagnoses (e.g., ICD-10) for MDD, comorbidities (e.g., anxiety disorders, substance use disorders), and history of prior treatment attempts.
  - Demographics: Age, gender, socioeconomic status, and other relevant demographic variables.
- Feature Engineering: Raw scores from clinical scales are used directly. Diagnostic history can be one-hot encoded. Longitudinal scores form a low-frequency time-series input to the model. Data harmonisation and standardisation protocols are essential to ensure comparability across sources and enable reproducible research.
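As a concrete illustration of this encoding step, the short Python sketch below assembles a weekly PHQ-9 series and one-hot comorbidity flags derived from coded diagnoses; all column names, codes, and scores are hypothetical placeholders rather than a prescribed schema.

```python
import pandas as pd

# Hypothetical weekly PHQ-9 observations for one patient; note the irregular
# gaps between assessments (weeks 2 -> 4 -> 6).
phq9 = pd.DataFrame({
    "week": [0, 1, 2, 4, 6],
    "phq9_total": [21, 19, 17, 14, 12],
})

# Diagnostic history as one-hot flags derived from coded ICD-10 diagnoses.
patient_diagnoses = {"F32.2", "F41.1"}                 # illustrative codes only
comorbidity_codes = {"F41.1": "anxiety", "F10.2": "alcohol_use"}
static_flags = {f"dx_{label}": int(code in patient_diagnoses)
                for code, label in comorbidity_codes.items()}

# The longitudinal scores become the low-frequency clinical stream of the PSV;
# the flags join its static, time-invariant portion.
print(phq9)
print(static_flags)
```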
2.1.2. Biological Data
- Data Sources: Genetic assays, neuroimaging scans, and blood or saliva samples.
- Metrics:
  - Pharmacogenetics: Genotypes for key genes influencing drug metabolism (e.g., cytochrome P450 enzymes like CYP2D6, CYP2C19) or drug targets [30].
  - Neuroimaging: Structural and functional measures derived from neuroimaging scans (e.g., fMRI).
  - Neuroendocrine Biomarkers: Measures from blood or saliva samples, such as cortisol concentration as an index of HPA axis function.
- Feature Engineering: Genetic polymorphisms are encoded as categorical variables; neuroimaging and biomarker data are represented as numerical features. Because these data are static or extremely low-frequency, they function as conditioning variables that modulate the model’s dynamic predictions rather than forming a temporal sequence. Clear documentation of acquisition protocols and feature preprocessing is necessary to ensure replicability and comparability across datasets [39].
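The sketch below illustrates how such static biological features might be encoded as conditioning variables: a CYP2D6 diplotype is mapped to a metaboliser category and one-hot encoded, then concatenated with a standardised baseline cortisol value. The diplotype-to-phenotype table and the reference mean/SD are illustrative assumptions, not clinical look-up values.

```python
import numpy as np

# Truncated, illustrative diplotype-to-phenotype mapping (not a clinical reference).
PHENOTYPE_OF = {
    ("*1", "*1"): "normal",
    ("*1", "*4"): "intermediate",
    ("*4", "*4"): "poor",
    ("*1xN", "*1"): "ultrarapid",
}
CATEGORIES = ["poor", "intermediate", "normal", "ultrarapid"]

def encode_metaboliser(diplotype):
    """One-hot encode the inferred CYP2D6 metaboliser status."""
    status = PHENOTYPE_OF.get(diplotype, "normal")   # defaulting is an assumption
    return np.array([float(status == c) for c in CATEGORIES])

# Static biological conditioning vector: genotype one-hot + z-scored cortisol.
cortisol_z = (14.2 - 12.0) / 3.5                     # illustrative reference mean/SD
static_bio = np.concatenate([encode_metaboliser(("*1", "*4")), [cortisol_z]])
print(static_bio)
```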
2.1.3. Digital Phenotype Data
- Feature Engineering Pipeline: Raw sensor data is typically aggregated into daily or hourly summaries to align with the temporal scale of mood fluctuations. Major behavioural domains and representative features include:
  - Mobility Patterns: Raw GPS coordinates are processed to derive features reflecting behavioural activation and social withdrawal (a minimal computation sketch appears later in this subsection). These include:
    - Location Variance: The statistical variance of latitude and longitude, capturing the geographic spread of a person’s movement.
    - Entropy: A measure of the diversity and predictability of visited locations. Lower entropy suggests a more restricted and repetitive routine.
    - Time Spent at Home: The proportion of time a user’s device is located at their inferred home location.
    - Distance Travelled: Total daily distance covered.
  - Social Activity: Call and SMS logs (metadata only, not content) are used to quantify social engagement. Features include:
    - Number and Duration of Incoming/Outgoing Calls: A proxy for active social interaction.
    - Number of Unique Contacts: A measure of social network size.
    - Text Message Frequency: A proxy for passive social communication.
  - Circadian Rhythms: The regularity of daily routines, particularly sleep-wake cycles, is a critical indicator of mental health. These patterns can be inferred from multiple sensors:
    - Screen On/Off Events: Timestamps of smartphone screen activity provide a coarse proxy for sleep onset and wake times.
    - Accelerometer Data: Periods of prolonged inactivity recorded by a wearable can measure sleep more directly.
    - Circadian Regularity: A metric quantifying the stability of these patterns from day to day.
  - Physical Activity: Accelerometer data is used to derive standard activity metrics such as daily step count and time spent in different activity intensity levels (e.g., sedentary, light, moderate).
Sensor Data Acquisition Pipeline
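As a minimal computation sketch for the mobility features listed above, the Python fragment below derives daily location variance, location entropy, time at home, and distance travelled from raw (latitude, longitude) samples. Place clustering is reduced to coordinate rounding and the degree-to-metre conversion is approximate; both simplifications are assumptions made for brevity.

```python
import numpy as np
import pandas as pd

def daily_mobility_features(gps: pd.DataFrame, home: tuple) -> dict:
    """Daily summaries from (lat, lon) samples; place clustering is reduced to
    coordinate rounding, and 1 degree is treated as ~111 km (both approximations)."""
    lat, lon = gps["lat"].to_numpy(), gps["lon"].to_numpy()

    # Location variance: geographic spread of movement.
    location_variance = float(np.var(lat) + np.var(lon))

    # Entropy over coarse location bins (a stand-in for visited-place clusters).
    bins = pd.Series(list(zip(lat.round(3), lon.round(3))))
    p = bins.value_counts(normalize=True).to_numpy()
    entropy = float(-(p * np.log(p)).sum())

    # Time at home: fraction of samples within ~100 m of the inferred home.
    dist_home = np.hypot(lat - home[0], lon - home[1]) * 111_000
    time_at_home = float((dist_home < 100).mean())

    # Distance travelled: summed displacement between consecutive samples.
    steps = np.hypot(np.diff(lat), np.diff(lon)) * 111_000
    distance_m = float(steps.sum())

    return {"location_variance": location_variance, "entropy": entropy,
            "time_at_home": time_at_home, "distance_m": distance_m}

# One synthetic day of sparse GPS samples scattered around a notional home.
rng = np.random.default_rng(0)
day = pd.DataFrame({"lat": 51.5 + rng.normal(0, 0.002, 48),
                    "lon": -0.12 + rng.normal(0, 0.002, 48)})
print(daily_mobility_features(day, home=(51.5, -0.12)))
```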
2.1.4. Data Integration and Handling
- Aggregation: High-frequency digital phenotype data is aggregated to a consistent temporal resolution (e.g., daily summaries).
- Imputation: For sparse data streams like weekly clinical scores or occasional biomarkers, appropriate imputation or interpolation techniques (e.g., mean imputation, forward-fill, or more sophisticated model-based imputation) must be applied.
- Asynchrony: The asynchronous nature of data collection across modalities is a core temporal modelling challenge. This can be mitigated at both the preprocessing and modelling stages. Architecturally, time-aware mechanisms, such as the gated structures described in Section 3, can explicitly represent time intervals between observations and learn from irregular sampling patterns [45,46,47].
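A minimal sketch of this integration step is shown below: sparse weekly PHQ-9 scores are merged onto a daily digital-phenotype grid, forward-filled, and accompanied by an observation mask and an elapsed-time feature that a time-aware model can use to discount stale values. Column names and values are illustrative.

```python
import numpy as np
import pandas as pd

# Daily digital-phenotype grid plus sparse weekly PHQ-9 scores (synthetic values).
rng = np.random.default_rng(1)
daily = pd.DataFrame({"date": pd.date_range("2025-01-01", periods=14, freq="D"),
                      "step_count": rng.integers(1000, 9000, 14)})
phq9 = pd.DataFrame({"date": pd.to_datetime(["2025-01-01", "2025-01-08"]),
                     "phq9_total": [19, 15]})

merged = daily.merge(phq9, on="date", how="left")
merged["phq9_observed"] = merged["phq9_total"].notna().astype(int)   # mask channel
merged["phq9_total"] = merged["phq9_total"].ffill()                  # simple carry-forward

# Elapsed days since the last *observed* PHQ-9, kept as an explicit feature so a
# time-aware model can discount increasingly stale carried-forward values.
last_obs = merged["date"].where(merged["phq9_observed"] == 1).ffill()
merged["days_since_phq9"] = (merged["date"] - last_obs).dt.days
print(merged[["date", "step_count", "phq9_total", "phq9_observed", "days_since_phq9"]])
```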
2.2. The Therapeutic Impulse Function (TIF): A Formalism for Treatment Inputs
- Pharmacotherapy TIF: A vector representation for a medication would include:
  - Mechanism of Action: A multi-hot encoded vector flagging the primary neurotransmitter systems targeted (e.g., both serotonin and norepinephrine for an SNRI).
  - Elimination Half-Life: A numerical feature for the drug’s elimination half-life (in hours), which governs dosing frequency and time to steady state [30].
  - Metabolism: A categorical feature representing the primary CYP450 enzyme responsible for its metabolism (e.g., CYP2D6), allowing for direct modelling of gene-drug interactions.
  - Dose: A normalised numerical feature representing the prescribed daily dosage relative to the standard therapeutic range.
- Psychotherapy TIF: A vector representation for a psychotherapy would include:
  - Modality: A one-hot encoded vector for the primary therapeutic approach (e.g., Cognitive Behavioural Therapy, Psychodynamic Therapy, Interpersonal Therapy) [25].
  - Dose and Schedule: Numerical features for session duration (in minutes) and frequency (sessions per week).
  - Process Variables: Where session rating data are available, a numerical feature for the therapeutic alliance score could be included as a time-varying component of the TIF.
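The sketch below assembles illustrative fixed-length TIF vectors for a medication and a psychotherapy following the fields described above; the exact vector layout, the neurotransmitter and enzyme vocabularies, and the dose normalisation are assumptions introduced here for illustration.

```python
import numpy as np

NT_SYSTEMS = ["serotonin", "norepinephrine", "dopamine"]
CYP_ENZYMES = ["CYP2D6", "CYP2C19", "CYP3A4"]
MODALITIES = ["CBT", "psychodynamic", "interpersonal"]

def pharmacotherapy_tif(targets, half_life_h, cyp, dose_mg, dose_range):
    """Fixed-length medication TIF: multi-hot mechanism, half-life, one-hot CYP
    enzyme, and dose normalised to the therapeutic range."""
    mechanism = [float(nt in targets) for nt in NT_SYSTEMS]
    metabolism = [float(cyp == e) for e in CYP_ENZYMES]
    lo, hi = dose_range
    dose_norm = (dose_mg - lo) / (hi - lo)
    return np.array(mechanism + [half_life_h] + metabolism + [dose_norm])

def psychotherapy_tif(modality, session_min, sessions_per_week):
    """Fixed-length psychotherapy TIF: one-hot modality plus dose/schedule."""
    return np.array([float(modality == m) for m in MODALITIES]
                    + [session_min, sessions_per_week])

# e.g., an SNRI (serotonin + norepinephrine), 12 h half-life, CYP2D6-metabolised,
# 75 mg/day against an assumed 75-225 mg therapeutic range; weekly 50-minute CBT.
print(pharmacotherapy_tif({"serotonin", "norepinephrine"}, 12.0, "CYP2D6", 75, (75, 225)))
print(psychotherapy_tif("CBT", 50, 1))
```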
2.3. The Predicted Recovery Trajectory (PRT): Defining the Forecasting Target
3. A Dynamic Forecasting Architecture for the Predicted Recovery Trajectory
3.1. Model Selection Rationale: From State-Space Models to Recurrent Neural Networks
3.2. Proposed Architecture: A Multi-Input, Time-Aware LSTM for Prognosis
- Dynamic Inputs ($x_t$): A sequence of time-varying vectors $x_1, x_2, \ldots, x_T$, where $x_t$ represents the Dynamic PSV features (e.g., symptom scores, aggregated digital phenotype metrics) at time step $t$.
- Static Inputs ($s$): A time-invariant vector representing the Static PSV (genetics, demographics) and the Therapeutic Impulse Function (TIF).
- Time Gaps ($\Delta_t$): To account for irregularity, we explicitly calculate the elapsed time between consecutive observations as $\Delta_t = \tau_t - \tau_{t-1}$, where $\tau_t$ denotes the absolute timestamp of observation $t$.
- Dynamic Pathway: The sequence $(x_t, \Delta_t)$ is fed into the first LSTM layer to encode temporal dynamics: $h_t^{(1)} = \mathrm{LSTM}_1(x_t, \Delta_t, h_{t-1}^{(1)})$.
- Static Pathway: The static vector $s$ is concatenated with the hidden state output of the first layer ($\oplus$ denotes concatenation) to condition the high-level feature learning: $z_t = h_t^{(1)} \oplus s$. This fused vector serves as the input to the second LSTM layer: $h_t^{(2)} = \mathrm{LSTM}_2(z_t, h_{t-1}^{(2)})$.
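A minimal PyTorch sketch of this two-pathway design is given below. It assumes a T-LSTM-style memory decay of the form $g(\Delta_t) = 1/\log(e + \Delta_t)$, a single forecast head, and arbitrary layer sizes; these choices are illustrative rather than a definitive implementation of the architecture.

```python
import math
import torch
import torch.nn as nn

class TimeAwareForecaster(nn.Module):
    """Sketch of the two-pathway design: an LSTM cell whose memory is decayed by
    the elapsed time between observations, followed by a second LSTM layer
    conditioned on the static PSV/TIF vector. Sizes and the decay form are
    illustrative assumptions."""

    def __init__(self, dyn_dim: int, static_dim: int, hidden: int = 32):
        super().__init__()
        self.cell1 = nn.LSTMCell(dyn_dim, hidden)
        self.cell2 = nn.LSTMCell(hidden + static_dim, hidden)
        self.head = nn.Linear(hidden, 1)               # predicted symptom severity

    def forward(self, x, dt, s):
        # x: (B, T, dyn_dim) dynamic PSV, dt: (B, T) time gaps, s: (B, static_dim)
        B, T, _ = x.shape
        h1 = c1 = torch.zeros(B, self.cell1.hidden_size)
        h2 = c2 = torch.zeros(B, self.cell2.hidden_size)
        preds = []
        for t in range(T):
            decay = 1.0 / torch.log(math.e + dt[:, t]).unsqueeze(1)  # time gate g(dt)
            c1 = c1 * decay                            # discount stale memory
            h1, c1 = self.cell1(x[:, t], (h1, c1))     # dynamic pathway
            z = torch.cat([h1, s], dim=1)              # static conditioning: h1 ⊕ s
            h2, c2 = self.cell2(z, (h2, c2))
            preds.append(self.head(h2))
        return torch.stack(preds, dim=1).squeeze(-1)   # (B, T) forecast trajectory

model = TimeAwareForecaster(dyn_dim=6, static_dim=10)
x, dt, s = torch.randn(4, 12, 6), torch.rand(4, 12) * 7, torch.randn(4, 10)
print(model(x, dt, s).shape)                           # torch.Size([4, 12])
```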
3.3. Training, Validation, and Evaluation Protocol
- Data Preprocessing: All continuous input features from the PSV and TIF will be standardised (e.g., z-score normalisation) to ensure they are on a comparable scale, a standard practice for training neural networks.
- Training: The model will be trained end-to-end using backpropagation through time (BPTT). The Adam optimiser, an adaptive learning rate optimisation algorithm, will be used to minimise the loss function. The primary loss function will be the Root Mean Squared Error (RMSE) calculated across all time points of the predicted and true trajectories.
- Validation: To prevent information leakage and ensure the model generalises to unseen patients, a patient-level, nested cross-validation scheme will be employed. The outer loop splits patients into training and test sets. The inner loop performs hyperparameter tuning (e.g., number of LSTM units, learning rate, dropout rate) on a validation set carved out from the training set. This approach ensures patient-level independence and provides a robust estimate of model generalisability.
- Evaluation Metrics: The evaluation must reflect the prognostic goal of forecasting the entire trajectory. While traditional classification metrics can be reported for benchmarking, the primary metrics should be:
  - Trajectory-wise Root Mean Squared Error (RMSE) or Mean Absolute Error (MAE): The average error calculated across all forecasted time points in the PRT. This is the primary measure of the model’s ability to accurately predict the entire path of recovery.
  - Endpoint MAE: The absolute error at the final time point of the forecast horizon (e.g., week 12). This allows for direct comparison with static models that only predict this single point.
  - Dynamic Time Warping (DTW): A metric that measures the similarity between two temporal sequences that may vary in speed. It is particularly useful for assessing whether the model has captured the correct shape of the recovery trajectory (e.g., rapid initial response followed by a plateau), even if it is slightly off in its timing.
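The three trajectory-level metrics can be computed as in the sketch below; the DTW implementation is a plain dynamic-programming version without a warping-window constraint, and the example trajectories are synthetic.

```python
import numpy as np

def trajectory_rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def endpoint_mae(y_true, y_pred):
    return float(abs(y_true[-1] - y_pred[-1]))

def dtw_distance(a, b):
    """Plain dynamic-programming DTW (no window constraint) for short trajectories."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])

# Two synthetic 12-week PHQ-9 trajectories with similar shape but shifted timing.
truth = np.array([20, 18, 15, 12, 10, 9, 8, 8, 7, 7, 7, 7], dtype=float)
pred  = np.array([20, 19, 17, 14, 11, 9, 8, 8, 7, 7, 7, 7], dtype=float)
print(trajectory_rmse(truth, pred), endpoint_mae(truth, pred), dtw_distance(truth, pred))
```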
Proposed Synthetic Validation Protocol
- Data Generation: We generate synthetic patient trajectories using a damped harmonic oscillator equation with injected noise, representing the cyclical but decaying nature of mood episodes.
- Irregular Sampling Simulation: We sample this ground-truth trajectory at random intervals $\Delta_t$, creating a “sparse” observation set typical of clinical data.
- Hypothesis Testing: We test the hypothesis that standard LSTMs will fail to capture the recovery rate because they treat the random intervals $\Delta_t$ as constant steps. The Time-Aware LSTM, using the time-decay gate $g(\Delta_t)$, is expected to recover the underlying decay parameter despite the irregular sampling.
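A minimal generator for this protocol is sketched below, assuming an exponentially decaying amplitude with a residual cyclical component and observation gaps drawn uniformly between 1 and 10 days; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def synthetic_trajectory(t, severity0=22.0, decay=0.04, period=21.0, noise=0.8):
    """Damped-oscillator symptom course: exponentially decaying amplitude with a
    residual cyclical component plus observation noise (parameters illustrative)."""
    clean = severity0 * np.exp(-decay * t) * (0.8 + 0.2 * np.cos(2 * np.pi * t / period))
    return clean + rng.normal(0.0, noise, size=np.shape(t))

# Irregular sampling: observation gaps drawn at random (1-10 days), mimicking
# missed appointments and sparse self-report.
gaps = rng.integers(1, 11, size=15)
t_obs = np.cumsum(gaps).astype(float)
y_obs = synthetic_trajectory(t_obs)
delta_t = np.diff(t_obs, prepend=0.0)

# (t_obs, delta_t, y_obs) constitute one synthetic patient for the protocol above.
print(np.column_stack([t_obs, delta_t, np.round(y_obs, 1)]))
```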
4. Bridging the Translational Gap: A Blueprint for Responsible Implementation
4.1. Ensuring Clinical Trust: From Black Box to Interpretable Prognosis (XAI)
- Feature Importance (Global and Local):
  - Technique: Methods like SHAP (Shapley Additive Explanations) can be used to quantify the contribution of each input feature to a specific prediction.
  - Clinical Application: After the model forecasts a PRT of non-response for a patient, SHAP can reveal the primary drivers of this prediction. For example, it might highlight that persistent sleep disruption (from the digital phenotype) and high baseline inflammatory markers are the top contributing factors. This transforms an opaque prediction (“non-responder”) into an interpretable clinical narrative (“The model predicts non-response, likely due to unresolved sleep and inflammation issues”), providing the clinician with a testable hypothesis to guide their next steps.
- Temporal Saliency:
  - Technique: If an attention mechanism is used, the attention weights themselves can be visualised to show which past time points the model focused on when making its forecast. Alternatively, gradient-based saliency methods can be used.
  - Clinical Application: This can identify critical periods in a patient’s illness course. For instance, the model might learn that a small dip in mood and social activity during week 3, even if it recovers, is a strong predictor of eventual relapse. This provides clinicians with insight into the importance of early, subtle fluctuations that might otherwise be dismissed. Such outputs can be visualised directly in the interface, allowing clinicians to inspect which temporal dynamics contributed most to a forecast, enhancing both trust and educational value.
- Counterfactual Explanations:
  - Technique: These methods generate “what-if” scenarios by minimally perturbing the input to change the model’s output.
  - Clinical Application: This provides the most actionable form of explanation (see the sketch below). The system could answer questions like, “How would the predicted trajectory change if this patient’s daily step count increased by 2000?” or “What is the minimum improvement in sleep regularity needed to shift the forecast from non-response to response?” This bridges model output with behavioural and clinical levers of change, aligning algorithmic insight with therapeutic reasoning, directly connecting the model’s prediction to modifiable behaviours, and supporting collaborative treatment planning between clinician and patient.
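The counterfactual probe can be reduced to a simple input perturbation, as in the sketch below. A toy forecaster stands in for the trained time-aware model of Section 3.2 so the fragment runs on its own, and the +1.0 shift assumes the step-count feature has been z-scored with a standard deviation of roughly 2000 steps; both are assumptions made for illustration.

```python
import torch

def what_if(forecast_fn, x, dt, s, feature_idx, delta):
    """Re-run a forecaster after shifting one dynamic feature by `delta` at every
    time step; returns (baseline, counterfactual) forecast trajectories."""
    x_cf = x.clone()
    x_cf[:, :, feature_idx] += delta
    with torch.no_grad():
        return forecast_fn(x, dt, s), forecast_fn(x_cf, dt, s)

# Stand-in forecaster so the sketch runs on its own; in practice this would be
# the trained time-aware model from Section 3.2.
toy_forecaster = lambda x, dt, s: x[..., 0].cumsum(dim=1) * 0.1

x, dt, s = torch.randn(1, 12, 6), torch.rand(1, 12) * 7, torch.randn(1, 10)
base, cf = what_if(toy_forecaster, x, dt, s, feature_idx=0, delta=1.0)
print(float((cf - base).mean()))   # average shift in the forecast trajectory
```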
4.2. An Ethical Framework for Prognostic AI
- Data Privacy and Informed Consent:
  - Challenge: Digital phenotyping involves the continuous, passive collection of highly sensitive personal data (location, communication metadata, activity), and a single consent obtained at enrolment cannot adequately reflect what is being collected or how a patient’s preferences may change over time.
  - Methodological Solution: Implementation of a dynamic consent interface within the data collection application. This would allow patients to view the data being collected, understand its purpose in simple terms, and have granular control to pause or withdraw specific data streams at any time. Furthermore, privacy-preserving machine learning techniques, such as federated learning, should be explored. In this approach, the model is trained on the user’s device, and only the updated model weights, not the raw personal data, are sent to a central server, significantly enhancing privacy [63]. Complementary approaches, such as differential privacy and secure multiparty computation, could further minimise re-identification risks in aggregated datasets.
- Algorithmic Bias and Fairness:
  - Challenge: AI models trained on historical data can learn and amplify existing societal biases and health disparities [23,64,65,66,67,68]. A model trained predominantly on one demographic group may perform poorly and unfairly on underrepresented populations, exacerbating inequities in care [69,70,71,72].
  - Methodological Solution: A mandatory algorithmic bias audit must be part of the model validation protocol. This involves disaggregating model performance metrics (e.g., Trajectory RMSE) across key demographic subgroups (e.g., race, gender, age, socioeconomic status) [73,74]; a minimal disaggregation sketch is given at the end of this subsection. If significant performance disparities are found, mitigation strategies must be employed, such as re-weighting the training data, applying fairness constraints during training, or collecting more data from the underperforming subgroup [73,74]. Results from these audits should be reported transparently in Supplementary Materials, establishing accountability and reproducibility.
- Accountability and Responsibility:
  - Challenge: If the model’s forecast contributes to an adverse patient outcome, who is responsible: the developer, the hospital, or the clinician who used the tool? [75].
  - Methodological Solution: The framework must establish clear lines of accountability. The model must be legally and ethically framed as a Clinical Decision Support (CDS) tool, not a medical device that makes autonomous decisions. The final clinical judgement and responsibility must always reside with the human clinician. The system’s documentation and user interface must explicitly state its probabilistic nature, its limitations, and its role as an assistive tool to augment, not replace, professional expertise. Explicit audit logs should track how model outputs are used in clinical decisions, supporting traceability and learning from errors [75,76,77].
- Regulatory Compliance:
  - Challenge: Prognostic tools of this kind sit in an evolving regulatory landscape, and the boundary between an unregulated wellness app and a regulated medical device is often unclear.
  - Methodological Solution: The development process must incorporate a regulatory monitoring step, ensuring that data handling practices comply with existing regulations such as HIPAA and GDPR, and that the tool’s classification (e.g., as a wellness app vs. a medical device) is appropriate and defensible [59,73]. A risk-based framework should be adopted, where the level of regulatory scrutiny is proportional to the potential impact of the tool’s output on patient care [64,65,66,67,68].
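Operationally, the bias audit described above reduces to disaggregating held-out error by subgroup, as in the sketch below; the group labels, error values, and disparity summary are synthetic placeholders.

```python
import numpy as np
import pandas as pd

# Synthetic held-out results: per-patient mean squared trajectory error plus a
# demographic group label (labels and error values are placeholders).
rng = np.random.default_rng(7)
results = pd.DataFrame({
    "patient_id": range(200),
    "group": rng.choice(["A", "B", "C"], size=200, p=[0.6, 0.3, 0.1]),
    "sq_error": rng.gamma(shape=2.0, scale=2.0, size=200),
})

# Disaggregate: subgroup sample size and trajectory RMSE.
audit = (results.groupby("group")["sq_error"]
         .agg(n="size", rmse=lambda e: float(np.sqrt(e.mean()))))
print(audit)

# A large gap between subgroups would trigger mitigation (re-weighting, fairness
# constraints, or targeted data collection), per the protocol above.
print(f"max subgroup RMSE gap: {audit['rmse'].max() - audit['rmse'].min():.2f}")
```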
4.3. Human-Centred Design for Clinical Integration (HCI)
- User-Centred Design Process:
  - Challenge: CDS tools are often designed by engineers with little understanding of clinical realities, leading to poor workflow integration and low adoption rates [75].
  - Methodological Solution: The design process must be iterative and participatory, involving clinicians, patients, and administrators from the earliest stages. Techniques like semi-structured interviews, workflow analysis (journey mapping), and the development of user personas can be used to deeply understand the needs, goals, and pain points of the end-users [75]. This ensures the final tool solves a real problem in a way that minimises, rather than increases, the clinician’s cognitive load. Iterative prototyping and usability metrics (e.g., task completion time, NASA-TLX workload) should be systematically recorded to ensure continuous alignment with user needs.
- Designing for Generalisability and Fairness:
  - Cross-Population Generalisation and Fairness: Ensuring fairness in prognostic AI extends beyond mitigating algorithmic bias during training. It also requires systematic evaluation of model generalisation across institutions, devices, and demographic groups. Performance audits should therefore be conducted across independent clinical sites and heterogeneous populations to identify and correct any degradation in accuracy or interpretability. Such cross-site validation not only strengthens the model’s robustness but also ensures equitable clinical benefit, preventing the amplification of existing disparities in access or outcomes.
- Interface Design for Probabilistic Information:
  - Challenge: Presenting a single, deterministic trajectory forecast can create a false sense of certainty and undermine clinical judgement.
  - Methodological Solution: The interface must be designed to communicate uncertainty effectively. The primary visualisation should not be a single line, but a “cone of probability,” showing the mean predicted trajectory surrounded by shaded confidence intervals (e.g., 50% and 95% prediction intervals); a minimal plotting sketch is given at the end of this subsection. The interface should also allow for interactive exploration, enabling the user to view the counterfactual explanations (“What if we change the TIF to this other medication?”) and see how the probabilistic forecast shifts in response. This interactivity is not cosmetic; it transforms the forecast into a shared reasoning space where clinicians can simulate and discuss treatment alternatives transparently. Simplicity, clarity, and effective use of colour and layout are paramount to ensure the information is glanceable and interpretable in a busy clinical setting [79,80].
- Supporting the Therapeutic Alliance:
  - Challenge: A prognostic forecast introduced into the consultation risks being experienced as a top-down verdict, potentially undermining the collaborative relationship between clinician and patient.
  - Methodological Solution: The CDS tool should be designed explicitly to facilitate a collaborative conversation. It can be used as a shared visual aid during a consultation. A clinician could show a patient their own digital phenotype data (e.g., “You can see here how your sleep regularity has improved over the last two weeks”) and then connect it to the forecasted PRT (“The model suggests this improvement is a very positive sign for your long-term recovery”). This reframes the tool from a top-down predictor to a bottom-up facilitator of shared understanding and self-efficacy, leveraging the patient’s own data to empower them in their recovery process. It supports the therapeutic relationship by providing a common, data-driven ground for discussion and goal-setting.
- Illustrative Use Case: To illustrate the envisioned clinical application, consider a follow-up consultation in which a clinician uses the prognostic decision support interface with a patient. The model forecasts a flattening in the predicted recovery trajectory over the next four weeks and highlights disrupted sleep regularity and reduced mobility as primary contributing factors. Guided by this insight, the clinician and patient review recent behavioural data, discuss possible stressors, and agree on specific adjustments to improve daily routines and sleep hygiene. This interaction exemplifies how the proposed system can function as an interpretive, collaborative aid—supporting clinical reasoning and shared understanding rather than replacing human judgement.
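For the “cone of probability” display described above, a minimal matplotlib sketch is shown below; the forecast mean, the widening uncertainty, and the interval multipliers assume a Gaussian predictive distribution and synthetic values.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic 12-week forecast: mean trajectory with widening uncertainty.
weeks = np.arange(0, 13)
mean = 20 * np.exp(-0.15 * weeks)          # placeholder mean PHQ-9 forecast
sd = 1.0 + 0.25 * weeks                    # placeholder predictive SD

fig, ax = plt.subplots(figsize=(6, 3))
ax.fill_between(weeks, mean - 1.96 * sd, mean + 1.96 * sd, alpha=0.15, label="95% interval")
ax.fill_between(weeks, mean - 0.67 * sd, mean + 0.67 * sd, alpha=0.30, label="50% interval")
ax.plot(weeks, mean, lw=2, label="mean predicted trajectory")
ax.set_xlabel("Week"); ax.set_ylabel("Predicted PHQ-9")
ax.legend(loc="upper right")
plt.tight_layout()
plt.show()
```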
5. Discussion
5.1. Summary of the Methodological Contribution
5.2. Limitations and Future Directions
5.3. Conclusion: A Roadmap for Prognostic Psychiatry
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| AI | Artificial Intelligence |
| BPTT | Back Propagation Through Time |
| CDS | Clinical Decision Support |
| DTW | Dynamic Time Warping |
| EHRs | Electronic Health Records |
| FDA | Food and Drug Administration (US) |
| fMRI | Functional Magnetic Resonance Imaging |
| GDPR | General Data Protection Regulation (EU) |
| GPS | Global Positioning System |
| HAM-D | Hamilton Depression Rating Scale |
| HCD | Human-Centred Design |
| HCI | Human-Computer Interaction |
| HIPAA | Health Insurance Portability and Accountability Act (US) |
| HPA | Hypothalamic-Pituitary-Adrenal (axis) |
| ICD-10 | International Classification of Diseases, 10th Revision |
| LSTM | Long Short-Term Memory |
| MAE | Mean Absolute Error |
| MDD | Major Depressive Disorder |
| ORCID | Open Researcher and Contributor ID |
| PHQ-9 | Patient Health Questionnaire-9 |
| PROMs | Patient-Reported Outcome Measures |
| PRT | Predicted Recovery Trajectory |
| PSV | Patient State Vector |
| RDOC | Research Domain Criteria |
| RMSE | Root Mean Squared Error |
| RNN | Recurrent Neural Network |
| SHAP | Shapley Additive Explanations |
| SMS | Short Message Service |
| SSM | State-Space Model |
| SUDs | Substance Use Disorders |
| TIF | Therapeutic Impulse Function |
| WHO | World Health Organization |
| XAI | Explainable Artificial Intelligence |
References
- Abrahams, A.B.; Beckenstrom, A.; Browning, M.; Dias, R.; Goodwin, G.M.; Gorwood, P.; Kingslake, J.; Morriss, R.; Reif, A.; Ruhe, H.G.; et al. Exploring the incidence of inadequate response to antidepressants in the primary care of depression. Eur. Neuropsychopharmacol. 2024, 83, 61–70. [Google Scholar] [CrossRef]
- Penn, E.M.; Tracy, D.K. The drugs don’t work? Antidepressants and the current and future pharmacological management of depression. Ther. Adv. Psychopharmacol. 2012, 2, 179–188. [Google Scholar] [CrossRef]
- Alharbi, A. Treatment-resistant depression: Therapeutic trends, challenges, and future directions. Patient Prefer. Adherence 2012, 6, 369–388. [Google Scholar] [CrossRef]
- Zelek-Molik, A.; Litwa, E. Trends in research on novel antidepressant treatments. Front. Pharmacol. 2025, 16, 1544795. [Google Scholar] [CrossRef]
- Voineskos, D.; Daskalakis, Z.J.; Blumberger, D.M. Management of treatment-resistant depression: Challenges and strategies. Neuropsychiatr. Dis. Treat. 2020, 16, 221–234. [Google Scholar] [CrossRef] [PubMed]
- Rush, A.J.; Trivedi, M.H.; Wisniewski, S.R.; Nierenberg, A.A.; Stewart, J.W.; Warden, D.; Niederehe, G.; Thase, M.E.; Lavori, P.W.; Lebowitz, B.D.; et al. Acute and longer-term outcomes in depressed outpatients requiring one or several treatment steps: A STAR* D report. Am. J. Psychiatry 2006, 163, 1905–1917. [Google Scholar] [CrossRef]
- Crown, W.H.; Finkelstein, S.; Berndt, E.R.; Ling, D.; Poret, A.W.; Rush, A.J.; Russell, J.M. The impact of treatment-resistant depression on health care utilization and costs. J. Clin. Psychiatry 2002, 63, 963–971. [Google Scholar] [CrossRef] [PubMed]
- Lépine, J.P.; Briley, M. The increasing burden of depression. Neuropsychiatr. Dis. Treat. 2011, 7, 3. [Google Scholar] [CrossRef]
- Ozomaro, U.; Wahlestedt, C.; Nemeroff, C.B. Personalized medicine in psychiatry: Problems and promises. BMC Med. 2013, 11, 132. [Google Scholar] [CrossRef] [PubMed]
- Baminiwatta, A. Global trends of machine learning applications in psychiatric research over 30 years: A bibliometric analysis. Asian J. Psychiatry 2022, 69, 102986. [Google Scholar] [CrossRef]
- Iyortsuun, N.K.; Kim, S.; Jhon, M.; Yang, H.; Pant, S. A review of machine learning and deep learning approaches on mental health diagnosis. Healthcare 2023, 11, 285. [Google Scholar] [CrossRef]
- Sun, J.; Lu, T.; Shao, X.; Han, Y.; Xia, Y.; Zheng, Y.; Wang, Y.; Li, X.; Ravindran, A.; Fan, L.; et al. Practical AI application in psychiatry: Historical review and future directions. Mol. Psychiatry 2025, 30, 4399–4408. [Google Scholar] [CrossRef]
- Shatte, A.; Hutchinson, D.; Teague, S. Machine learning in mental health: A scoping review of methods and applications. Psychol. Med. 2019, 49, 1426–1448. [Google Scholar] [CrossRef]
- Karvelis, P.; Charlton, C.E.; Allohverdi, S.G.; Bedford, P.; Hauke, D.J.; Diaconescu, A.O. Computational approaches to treatment response prediction in major depression using brain activity and behavioral data: A systematic review. Netw. Neurosci. 2022, 6, 1066–1103. [Google Scholar] [CrossRef] [PubMed]
- Coley, R.Y.; Boggs, J.M.; Beck, A.; Simon, G.E. Predicting outcomes of psychotherapy for depression with electronic health record data. J. Affect. Disord. Rep. 2021, 6, 100198. [Google Scholar] [CrossRef] [PubMed]
- Simon, G.E.; Cruz, M.; Boggs, J.M.; Beck, A.; Shortreed, S.M.; Coley, R.Y. Predicting outcomes of antidepressant treatment in community practice settings. Psychiatr. Serv. 2024, 75, 419–426. [Google Scholar] [CrossRef]
- Elsaesser, M.; Feige, B.; Kriston, L.; Schumacher, L.; Peifer, J.; Hautzinger, M.; Härter, M.; Schramm, E. Longitudinal clusters of long-term trajectories in patients with early-onset chronic depression: 2 years of naturalistic follow-up after extensive psychological treatment. Psychother. Psychosom. 2023, 93, 65–74. [Google Scholar] [CrossRef]
- Choi, E.; Bahadori, M.T.; Schuetz, A.; Stewart, W.F.; Sun, J. Doctor AI: Predicting Clinical Events via Recurrent Neural Networks. arXiv 2016, arXiv:1511.05942. [Google Scholar] [CrossRef]
- Frässle, S.; Marquand, A.F.; Schmaal, L.; Dinga, R.; Veltman, D.J.; Wee, N.J.v.D.; Tol, M.v.; Schöbi, D.; Penninx, B.W.; Stephan, K.E. Predicting individual clinical trajectories of depression with generative embedding. NeuroImage Clin. 2020, 26, 102213. [Google Scholar] [CrossRef]
- Lai, W.; Liao, Y.; Zhang, H.; Zhao, H.; Li, Y.; Chen, R.; Shi, G.; Liu, Y.; Hao, J.; Li, Z.; et al. The trajectory of depressive symptoms and the association with quality of life and suicidal ideation in patients with major depressive disorder. BMC Psychiatry 2025, 25, 310. [Google Scholar] [CrossRef]
- Schmaal, L.; Marquand, A.F.; Rhebergen, D.; Tol, M.v.; Ruhé, H.G.; Wee, N.J.v.D.; Veltman, D.J.; Penninx, B.W. Predicting the naturalistic course of major depressive disorder using clinical and multimodal neuroimaging information: A multivariate pattern recognition study. Biol. Psychiatry 2015, 78, 278–286. [Google Scholar] [CrossRef]
- Stephan, K.E.; Bach, D.R.; Fletcher, P.C.; Flint, J.; Frank, M.J.; Friston, K.J.; Heinz, A.; Huys, Q.J.M.; Owen, M.J.; Binder, E.B.; et al. Charting the landscape of computational psychiatry. Lancet Psychiatry 2017, 4, 324–334. [Google Scholar] [CrossRef]
- Ngabo-Woods, H.; Dunai, L.; Verdú, I.S. A Prognostic Theory of Treatment Response for Major Depressive Disorder: A Dynamic Systems Framework for Forecasting Clinical Trajectories. Appl. Sci. 2025, 15, 12524. [Google Scholar] [CrossRef]
- Farrar, C.R.; Lieven, N.A.J. Damage prognosis: The future of structural health monitoring. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2007, 365, 623–632. [Google Scholar] [CrossRef]
- Hayes, A.M.; Andrews, L.A. A complex systems approach to the study of change in psychotherapy. BMC Med. 2020, 18, 197. [Google Scholar] [CrossRef] [PubMed]
- Durstewitz, D.; Huys, Q.J.M.; Koppe, G. Psychiatric illnesses as disorders of network dynamics. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 2021, 6, 865–876. [Google Scholar] [CrossRef] [PubMed]
- Sajjadian, M.; Lam, R.W.; Milev, R.; Rotzinger, S.; Frey, B.N.; Soares, C.N.; Parikh, S.V.; Foster, J.A.; Turecki, G.; Müller, D.J.; et al. Machine learning in the prediction of depression treatment outcomes: A systematic review and meta-analysis. Psychol. Med. 2021, 51, 2742–2751. [Google Scholar] [CrossRef]
- Curtiss, J.; DiPietro, C.P. Machine learning in the prediction of treatment response for emotional disorders: A systematic review and meta-analysis. Clin. Psychol. Rev. 2025, 120, 102593. [Google Scholar] [CrossRef]
- Ntam, V.A.; Huebner, T.; Steffens, M.; Scholl, C. Machine learning approaches in the therapeutic outcome prediction in major depressive disorder: A systematic review. Front. Psychiatry 2025, 16, 1588963. [Google Scholar] [CrossRef]
- Hiemke, C.; Baumann, P.; Bergemann, N.; Conca, A.; Dietmaier, O.; Egberts, K.; Fric, M.; Gerlach, M.; Greiner, C.; Gründer, G.; et al. AGNP consensus guidelines for therapeutic drug monitoring in psychiatry: Update 2011. Pharmacopsychiatry 2011, 44, 195–235. [Google Scholar] [CrossRef]
- Kang, S.; Cho, S. Neuroimaging biomarkers for predicting treatment response and recurrence of major depressive disorder. Int. J. Mol. Sci. 2020, 21, 2148. [Google Scholar] [CrossRef]
- Köhler-Forsberg, K.; Jorgensen, A.; Dam, V.H.; Stenbæk, D.S.; Fisher, P.M.; Ip, C.; Ganz, M.; Poulsen, H.E.; Giraldi, A.; Ozenne, B.; et al. Predicting treatment outcome in major depressive disorder using serotonin 4 receptor pet brain imaging, functional mri, cognitive-, eeg-based, and peripheral biomarkers: A neuropharm open label clinical trial protocol. Front. Psychiatry 2020, 11, 641. [Google Scholar] [CrossRef]
- Fonseka, T.M.; MacQueen, G.; Kennedy, S.H. Neuroimaging biomarkers as predictors of treatment outcome in major depressive disorder. J. Affect. Disord. 2018, 233, 21–35. [Google Scholar] [CrossRef]
- Li, Z.; McIntyre, R.S.; Husain, S.F.; Ho, R.; Tran, B.X.; Nguyen, H.T.; Soo, S.; Ho, C.S.H.; Chen, N. Identifying neuroimaging biomarkers of major depressive disorder from cortical hemodynamic responses using machine learning approaches. eBioMedicine 2022, 79, 104027. [Google Scholar] [CrossRef]
- Li, X.; Pei, C.; Wang, X.; Wang, H.; Tian, S.; Yao, Z.; Lü, Q. Predicting neuroimaging biomarkers for antidepressant selection in early treatment of depression. J. Magn. Reson. Imaging 2021, 54, 551–559. [Google Scholar] [CrossRef]
- Cai, H.; Song, H.; Yang, Y.; Xiao, Z.; Zhang, X.; Jiang, F.; Liu, H.; Tang, Y. Big-five personality traits and depression: Chain mediation of self-efficacy and walking. Front. Psychiatry 2024, 15, 1460888. [Google Scholar] [CrossRef]
- Watson, M.; Protzner, A.B.; McGirr, A. Five-factor personality and antidepressant response to intermittent theta burst stimulation for major depressive disorder. Transcranial Magn. Stimul. 2025, 5, 100196. [Google Scholar] [CrossRef]
- Chen, J.; Huang, H. The influence of big five personality traits on depression and suicidal behavior. In The Association Between Depression and Suicidal Behavior; IntechOpen: London, UK, 2024. [Google Scholar] [CrossRef]
- Choi, E.; Bahadori, M.T.; Kulas, J.A.; Schuetz, A.; Stewart, W.F.; Sun, J. RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Curran Associates Inc.: Red Hook, NY, USA, 2016. NIPS’16. pp. 3512–3520. [Google Scholar]
- Raballo, A. Digital phenotyping: An overarching framework to capture our extended mental states. Lancet Psychiatry 2018, 5, 194–195. [Google Scholar] [CrossRef] [PubMed]
- Torous, J.; Onnela, J.; Keshavan, M.S. New dimensions and new tools to realize the potential of rdoc: Digital phenotyping via smartphones and connected devices. Transl. Psychiatry 2017, 7, e1053. [Google Scholar] [CrossRef] [PubMed]
- Baytas, I.M.; Xiao, C.; Zhang, X.; Wang, F.; Jain, A.K.; Zhou, J. Patient Subtyping via Time-Aware LSTM Networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; KDD ’17. pp. 65–74. [Google Scholar] [CrossRef]
- Guo, Y.; Wen, T.; Yue, S.; Zhao, X.; Huang, K. The Influence of Health Information Attention and App Usage Frequency of Older Adults on Persuasive Strategies in mHealth Education Apps. Digit. Health 2023, 9, 20552076231167003. [Google Scholar] [CrossRef] [PubMed]
- Zhang, Z.; Wang, Y.; Tan, S.; Xia, B.; Luo, Y. Enhancing transformer-based models for long sequence time series forecasting via structured matrix. Neurocomputing 2025, 625, 129429. [Google Scholar] [CrossRef]
- Liu, L.J.; Ortiz-Soriano, V.; Neyra, J.A.; Chen, J. Kit-lstm: Knowledge-guided time-aware lstm for continuous clinical risk prediction. In Proceedings of the 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Las Vegas, NV, USA, 6–8 December 2022; pp. 1086–1091. [Google Scholar] [CrossRef]
- Miotto, R.; Li, L.; Kidd, B.A.; Dudley, J.T. Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records. Sci. Rep. 2016, 6, 26094. [Google Scholar] [CrossRef]
- Lipton, Z.C.; Kale, D.C.; Elkan, C.; Wetzel, R. Learning to Diagnose with LSTM Recurrent Neural Networks. arXiv 2015, arXiv:1511.03677. [Google Scholar]
- Che, Z.; Purushotham, S.; Cho, K.; Sontag, D.; Liu, Y. Recurrent Neural Networks for Multivariate Time Series with Missing Values. Sci. Rep. 2018, 8, 6085. [Google Scholar] [CrossRef]
- Song, Z.; Lu, Q.; Zhu, H.; Buckeridge, D.; Li, Y. Bidirectional Generative Pre-training for Improving Healthcare Time-series Representation Learning. arXiv 2024, arXiv:2402.09558. [Google Scholar]
- Liu, R.; Hou, X.; Liu, S.; Zhou, Y.; Zhou, J.; Qiao, K.; Qi, H.; Li, R.; Yang, Z.; Zhang, L.; et al. Predicting antidepressant response via local-global graph neural network and neuroimaging biomarkers. npj Digit. Med. 2025, 8, 515. [Google Scholar] [CrossRef]
- Ntekouli, M.; Spanakis, G.; Waldorp, L.; Roefs, A. Exploiting Individual Graph Structures to Enhance Ecological Momentary Assessment (EMA) Forecasting. arXiv 2024, arXiv:2403.19442. [Google Scholar] [CrossRef]
- Lin, E.; Chen, C.H.; Chen, H.H. Computational approaches to treatment response prediction in major depressive disorder: A systematic review. Netw. Neurosci. 2021, 6, 1066–1090. [Google Scholar] [CrossRef]
- An, P.H. Exploring the Digital Healthcare Product’s Logistics and Mental Healthcare in the Metaverse: Role of Technology Anxiety and Metaverse Bandwidth Fluctuations. iRASD J. Manag. 2023, 5, 223–241. [Google Scholar] [CrossRef]
- Wen, Q.; Zhou, T.; Zhang, C.; Chen, W.; Ma, Z.; Yan, J.; Sun, L. Transformers in time series: A survey. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, Macao, China, 19–25 August 2023; IJCAI: Bremen, Germany, 2023; pp. 6778–6786. [Google Scholar] [CrossRef]
- Jiang, Y.; Ning, K.; Pan, Z.; Shen, X.; Ni, J.; Yu, W.; Schneider, A.; Chen, H.; Nevmyvaka, Y.; Song, D. Multi-modal Time Series Analysis: A Tutorial and Survey. arXiv 2025, arXiv:2503.13709. [Google Scholar] [CrossRef]
- Chang, C.; Hwang, J.; Shi, Y.; Wang, H.; Peng, W.C.; Chen, T.F.; Wang, W. Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series. arXiv 2025, arXiv:2506.10412. [Google Scholar] [CrossRef]
- Joyce, D.W.; Kormilitzin, A.; Smith, K.; Cipriani, A. Explainable artificial intelligence for mental health through transparency and interpretability for understandability. npj Digit. Med. 2023, 6, 6. [Google Scholar] [CrossRef]
- Probierz, B.; Straś, A.; Rodek, P.; Kozak, J. Explainable ai in psychiatry. In Explainable Artificial Intelligence for Sustainable Development; Routledge: London, UK, 2025; pp. 245–262. [Google Scholar] [CrossRef]
- Alowais, S.A.; Alghamdi, S.S.; Alsuhebany, N.; Alqahtani, T.; Alshaya, A.I.; Almoaiqel, M. Revolutionizing healthcare: The role of artificial intelligence in clinical practice. BMC Med. Educ. 2023, 23, 689. [Google Scholar] [CrossRef]
- Tilala, M.H.; Chenchala, P.K.; Choppadandi, A.; Kaur, J.; Naguri, S.; Saoji, R.; Devaguptapu, B. Ethical considerations in the use of artificial intelligence and machine learning in health care: A comprehensive review. Cureus 2024, 16, e62443. [Google Scholar] [CrossRef]
- Bauer, M.; Glenn, T.; Monteith, S.; Bauer, R.; Whybrow, P.C.; Geddes, J. Ethical perspectives on recommending digital technology for patients with mental illness. Int. J. Bipolar Disord. 2017, 5, 6. [Google Scholar] [CrossRef] [PubMed]
- Ratti, E.; Morrison, M.; Jakab, I. Ethical and social considerations of applying artificial intelligence in healthcare—A two-pronged scoping review. BMC Med. Ethics 2025, 26, 68. [Google Scholar] [CrossRef]
- Rieke, N.; Hancox, J.; Li, W.; Milletarì, F.; Roth, H.R.; Albarqouni, S.; Bakas, S.; Galtier, M.; Landman, B.A.; Maier-Hein, K.; et al. The future of digital health with federated learning. npj Digit. Med. 2020, 3, 119. [Google Scholar] [CrossRef] [PubMed]
- Oladimeji, O.; Oladimeji, T.; Abiodun, O. Artificial intelligence in mental health: A review of current trends and future directions. J. Ment. Health Clin. Psychol. 2023, 7, 65–72. [Google Scholar] [CrossRef]
- Avula, V.C.R.; Amalakanti, S. Artificial intelligence in psychiatry, present trends, and challenges: An updated review. Arch. Ment. Health 2023, 25, 85–90. [Google Scholar] [CrossRef]
- Alhuwaydi, A.M. Exploring the role of artificial intelligence in mental healthcare: Current trends and future directions—A narrative review for a comprehensive insight. Risk Manag. Healthc. Policy 2024, 17, 1339–1348. [Google Scholar] [CrossRef]
- Hoose, S.; Králiková, K. Artificial intelligence in mental health care: Management implications, ethical challenges, and policy considerations. Adm. Sci. 2024, 14, 227. [Google Scholar] [CrossRef]
- Cruz-Gonzalez, P.; He, A.; Lam, E.K.M.; Ng, I.A.T.; Li, M.; Hou, R.; Chan, J.N.; Sahni, Y.; Viñas-Guasch, N.; Miller, T.; et al. Artificial intelligence in mental health care: A systematic review of diagnosis, monitoring, and intervention applications. Psychol. Med. 2025, 55, e18. [Google Scholar] [CrossRef] [PubMed]
- Pfisterer, F. Algorithmic fairness. In Applied Machine Learning Using Mlr3 in R; Chapman and Hall/CRC: Boca Raton, FL, USA, 2023; pp. 316–324. [Google Scholar] [CrossRef]
- Summerton, N.; Cansdale, M. Artificial intelligence and diagnosis in general practice. Br. J. Gen. Pract. 2019, 69, 324–325. [Google Scholar] [CrossRef] [PubMed]
- Jones, C.; Thornton, J.; Wyatt, J.C. Artificial intelligence and clinical decision support: Clinicians’ perspectives on trust, trustworthiness, and liability. Med. Law Rev. 2023, 31, 501–520. [Google Scholar] [CrossRef]
- Wang, X.; Zhang, Y.; Zhu, R. A brief review on algorithmic fairness. Manag. Syst. Eng. 2022, 1, 7. [Google Scholar] [CrossRef]
- Morley, J.; Machado, C.C.V.; Burr, C.; Cowls, J.; Joshi, I.; Taddeo, M.; Floridi, L. The ethics of ai in health care: A mapping review. Soc. Sci. Med. 2020, 260, 113172. [Google Scholar] [CrossRef]
- Rajkomar, A.; Hardt, M.; Howell, M.; Corrado, G.S.; Chin, M.H. Ensuring fairness in machine learning to advance health equity. Ann. Intern. Med. 2018, 169, 866–872. [Google Scholar] [CrossRef]
- Reddy, S. Generative ai in healthcare: An implementation science informed translational path on application, integration and governance. Implement. Sci. 2024, 19, 27. [Google Scholar] [CrossRef]
- Mucci, F.; Marazziti, D. Artificial Intelligence in Neuropsychiatry: A Potential Beacon in an Ocean of Uncertainty? Clin. Neuropsychiatry 2023, 20, 467–471. [Google Scholar]
- Ray, A.; Bhardwaj, A.; Malik, Y.K.; Singh, S.; Gupta, R. Artificial intelligence and psychiatry: An overview. Asian J. Psychiatry 2022, 70, 103021. [Google Scholar] [CrossRef]
- World Health Organization. WHO Calls for Safe and Ethical AI for Health. 2023. Available online: https://www.who.int/news/item/16-05-2023-who-calls-for-safe-and-ethical-ai-for-health (accessed on 28 June 2025).
- Mishra, R.; Satpathy, R.; Pati, B. Human computer interaction applications in healthcare: An integrative review. EAI Endorsed Trans. Pervasive Health Technol. 2023, 9, 1–10. [Google Scholar] [CrossRef]
- Zhao, X.; Zhang, S.; Nan, D.; Han, J.; Kim, J.H. Human–computer interaction in healthcare: A bibliometric analysis with citespace. Healthcare 2024, 12, 2467. [Google Scholar] [CrossRef] [PubMed]
- Caetano, R.; Oliveira, J.M.; Ramos, P. Transformer-based models for probabilistic time series forecasting with explanatory variables. Mathematics 2025, 13, 814. [Google Scholar] [CrossRef]

| Domain | Sub-Domain | Data Source | Raw Metric |
|---|---|---|---|
| Clinical | Symptom Severity | PHQ-9 Questionnaire | Sum of item scores |
| Clinical | Comorbidity | Electronic Health Record | ICD-10 codes |
| Biological | Pharmacogenetics | Genetic Assay | CYP2D6 genotype |
| Biological | Neuroendocrinology | Saliva/Blood Sample | Cortisol concentration |
| Digital Phenotype | Mobility | Smartphone GPS | Latitude/Longitude |
| Digital Phenotype | Social Activity | Smartphone Call Log | Call metadata |
| Digital Phenotype | Social Activity | Smartphone SMS Log | SMS metadata |
| Digital Phenotype | Circadian Rhythms | Smartphone Screen | Screen on/off timestamps |
| Digital Phenotype | Circadian Rhythms | Wearable Accelerometer | 3-axis acceleration |
| Digital Phenotype | Physical Activity | Wearable Accelerometer | Step detection |

| Domain | Engineered Feature | Sampling Freq. | Theoretical Link to MDD |
|---|---|---|---|
| Clinical | Weekly PHQ-9 Score | Weekly | Core measure of depression severity |
| Clinical | Binary flags for anxiety, SUDs | Static (Baseline) | Comorbidity impacts prognosis |
| Biological | Poor/Intermediate/Normal/Ultra-rapid metaboliser status | Static (Baseline) | Influences drug exposure/side effects |
| Biological | Baseline cortisol level | Static (Baseline) | HPA axis dysregulation |
| Digital Phenotype | Location Entropy | Daily (from Hz data) | Anhedonia, avolition, behavioural withdrawal |
| Digital Phenotype | Time Spent at Home | Daily (from Hz data) | Social withdrawal |
| Digital Phenotype | Number of unique contacts | Daily | Social network size/engagement |
| Digital Phenotype | Ratio of incoming/outgoing texts | Daily | Social reciprocity |
| Digital Phenotype | Inferred sleep duration | Daily | Sleep disturbance, insomnia/hypersomnia |
| Digital Phenotype | Circadian Regularity Index | Daily (from Hz data) | Disruption of daily routines |
| Digital Phenotype | Daily Step Count | Daily | Psychomotor retardation/agitation |
| Model | Input Data | Evaluation Metrics |
|---|---|---|
| Multiple Linear Regression (Baseline) | Baseline PSV only (static features) | Endpoint MAE, AUC at 12 weeks |
| ARIMA | Univariate time-series of clinical scores only | Trajectory RMSE, Endpoint MAE |
| Standard LSTM | Full time-series of PSV + TIF (no time-awareness) | Trajectory RMSE, Endpoint MAE, DTW |
| Time-Aware LSTM (Proposed) | Full time-series of PSV + TIF (with time-awareness) | Trajectory RMSE, Endpoint MAE, DTW |
| Implementation Phase | Key Deliverable | Primary Risk | Methodological Mitigation |
|---|---|---|---|
| Phase 1: Data Architecture | Constructed PSV & TIF vectors. | Sparsity & Noise: Sensors fail; patients skip surveys. | Impute missingness via time-decay gates; use “masking” layers in LSTM training. |
| Phase 2: Model Training | Trained Time-Aware LSTM. | Overfitting: Model memorises specific patient histories. | Nested Cross-Validation (patient-level split); Regularisation (Dropout). |
| Phase 3: Explainability (XAI) | SHAP plots & Attention Maps. | Misinterpretation: Clinicians over-rely on false signals. | Counterfactual testing (“What if?”); Uncertainty quantification (Confidence Intervals). |
| Phase 4: Clinical Integration | Decision Support Interface. | Alert Fatigue: Too many false alarms. | Set high specificity thresholds; co-design interface with clinicians (HCD). |
| Phase 5: Governance | Bias Audit Report. | Algorithmic Bias: Model fails for minority groups. | Mandatory performance disaggregation by demographic group; Federated Learning for privacy. |