Article

A Multimodal Framework for Prognostic Modelling of Mental Health Treatment and Recovery Trajectories

1 Universitat Politècnica de València, 46022 Valencia, Spain
2 Department of Graphical Engineering, Universitat Politècnica de València, 46022 Valencia, Spain
* Author to whom correspondence should be addressed.
Appl. Sci. 2026, 16(2), 763; https://doi.org/10.3390/app16020763
Submission received: 28 November 2025 / Revised: 29 December 2025 / Accepted: 7 January 2026 / Published: 12 January 2026

Abstract

The clinical management of major depressive disorder is constrained by a trial-and-error approach. Computational methods to date have focused largely on static binary classification (e.g., responder vs. non-responder), ignoring the dynamic nature of recovery. Building upon the recently proposed prognostic theory of treatment response, this article presents a methodological framework for its operationalisation. We define a multi-modal data architecture for the theory’s core constructs—the Patient State Vector (PSV), Therapeutic Impulse Function (TIF), and Predicted Recovery Trajectory (PRT)—transforming them from abstract concepts into specified computational inputs. To model the asynchronous interactions between these components, we specify a Time-Aware Long Short-Term Memory (LSTM) architecture, providing explicit mathematical formulations for time-decay gates to handle irregular clinical sampling. Furthermore, we outline a synthetic validation protocol to benchmark this dynamic approach against static baselines. By integrating these technical specifications with a translational pipeline for Explainable AI (XAI) and ethical governance, this paper provides the necessary blueprint to transition psychiatry from theoretical prognosis to empirical forecasting.

1. Introduction

1.1. The Prognostic Imperative in Psychiatry

The management of major depressive disorder (MDD) remains one of the most significant challenges in contemporary medicine. Despite a wide array of available treatments, the process of finding an effective intervention for an individual patient is often a protracted and painful exercise in trial and error [1,2,3,4,5]. Response rates to initial pharmacotherapy are modest, with fewer than half of patients experiencing a clinically meaningful response and only a third achieving full remission [6]. This variability underscores the limitations of current clinical decision-making approaches, which rely heavily on sequential empirical adjustments rather than predictive guidance [7,8]. Framing outcomes as fixed endpoints under these approaches fails to capture the continuous and context-dependent nature of recovery, reducing a multidimensional temporal process to a static label.
For decades, the field of personalised psychiatry has pursued reliable biomarkers to guide treatment selection, yet this goal has remained elusive [9]. In response, a substantial body of research has applied machine learning and other computational techniques to predict treatment outcomes [5,10,11,12,13]. However, these efforts have had minimal impact on clinical practice, a translational failure that stems from the dominance of static prediction models and their limited integration into clinical workflows [14]. The vast majority of current models are designed to classify a single, binary endpoint—such as “responder” versus “non-responder”—at a fixed future time point, based on a snapshot of baseline characteristics [15,16]. This approach is a profound oversimplification of clinical reality. Recovery is not a singular event but a dynamic, longitudinal process with significant intra-individual variability [17,18]. By collapsing this rich temporal information into a binary outcome, static models discard crucial data about the speed of response, patterns of symptom fluctuation, and the stability of improvement, all of which are vital for ongoing clinical management [19,20,21].
The critical knowledge gap, therefore, is not merely in improving the accuracy of static prediction, but in developing the capacity for dynamic, personalised prognosis—the ability to forecast the trajectory of a patient’s illness course under a specific treatment [22]. A clinician’s task is not just to guess an outcome at 12 weeks but to monitor and adjust treatment based on the patient’s progress over time. A truly useful clinical tool must align with this reality. It must forecast the path, not merely the endpoint. This shift from a static to a dynamic perspective represents the central computational and clinical imperative for the next generation of personalised psychiatry.

1.2. A Recap of the Prognostic Theory

To address this imperative, a prior work introduced a theoretical framework conceptualising mental health treatment response as a complex dynamic system [23]. This framework establishes an interdisciplinary paradigm by drawing a powerful analogy from the engineering field of structural health monitoring and its advanced application, damage prognosis [24]. While diagnosis in engineering answers the question, “What is the state of the system now?”, prognosis answers the far more valuable question, “Given the current state and future loads, how will the system behave over time?” [24]. The goal is to forecast the “remaining useful life” of a system, enabling predictive and adaptive interventions.
Applying this prognostic perspective to psychiatry, an individual patient is conceptualised as a complex system, with MDD representing an undesirable but stable “attractor state”. A therapeutic intervention is an active perturbation or “load” applied to this system with the goal of shifting it toward a stable state of recovery [25,26]. To formalise this, the theory introduced three core constructs [23]:
  • The Patient State Vector (PSV): A comprehensive, multi-modal, and time-varying representation of a patient’s state, analogous to the initial state assessment in engineering. It integrates clinical, biological, and high-frequency digital phenotype data to create a high-dimensional characterisation of the individual.
  • The Therapeutic Impulse Function (TIF): A formal characterisation of a treatment’s properties (e.g., pharmacodynamics, therapeutic modality), analogous to the “future loading conditions” in an engineering system. It defines the specific perturbation being applied to the patient’s system.
  • The Predicted Recovery Trajectory (PRT): The forecasted, continuous path of a patient’s symptom severity over time, analogous to the “remaining useful life” prediction. The theory’s central thesis is that the PRT is an emergent property of the dynamic interaction between an individual’s unique PSV and a specific TIF.
Together, these constructs provide a formal vocabulary for linking patient data, treatment properties, and temporal outcomes within a unified prognostic modelling framework.

1.3. Objective: From Theory to a Testable Methodology

The formulation of this prognostic theory was the necessary first step. However, a theory’s value is ultimately determined by its ability to be tested, refined, and translated into practice. The present work bridges that gap by presenting a structured methodological framework that operationalises the prognostic theory, making it empirically testable and directly applicable to computational and clinical research contexts.
Rather than reporting empirical findings, this work provides a replicable and adaptable methodology for researchers aiming to apply and validate the prognostic theory. It responds to persistent methodological barriers in computational psychiatry—heterogeneous data standards, limited sample sizes, and insufficient validation protocols—that have constrained translational progress [27,28,29]. By establishing a standardised, transparent, and ethically grounded process, this framework aims to serve as a shared foundation for reproducible research and for the design of human-centred clinical support tools.
The paper is structured as follows: Section 2 provides a granular guide to operationalising the core constructs, detailing the data architecture for the PSV, TIF, and PRT. Section 3 proposes and justifies a specific deep learning architecture—a time-aware Long Short-Term Memory (LSTM) network—for forecasting the PRT and outlines a rigorous protocol for its training and evaluation. Section 4 outlines the translational pathway, including explainability, ethical governance, and principles of human-computer interaction for clinical integration. Section 5 discusses the implications and limitations of the framework, positioning it as a scalable foundation for predictive, personalised, and human-centred psychiatry.
This framework situates the development of prognostic AI within a human-centred design paradigm, emphasising interpretability, ethical responsibility, and the co-evolution of computational and clinical reasoning.

1.4. Scope and Contributions

This manuscript presents a methodological framework—a theoretical and architectural blueprint—rather than the results of an empirical trial. Given the current lack of standardised, multi-modal datasets combining high-frequency digital phenotyping with biological markers, a rigorous “design-first” approach is required before large-scale data collection can commence.
The specific contributions of this work are:
  • A Novel Taxonomy for Prognosis: The operationalisation of the “Patient State Vector” (PSV) and “Therapeutic Impulse Function” (TIF) provides a formal vocabulary for modelling the dynamic interaction between patient and treatment, distinct from static baseline predictors.
  • A Time-Aware Architecture Specification: We specify a deep learning architecture explicitly designed for the irregularity of real-world clinical data, addressing the “asynchrony” problem that renders standard RNNs ineffective in practice.
  • A Translational Roadmap: Unlike purely technical papers, we provide an integrated implementation strategy that treats Explainable AI (XAI) and ethical governance as architectural requirements, not optional add-ons.
Limitations of Scope: This paper does not report results from a clinical cohort. Instead, it defines the protocol and computational specifications required to generate such results. Validation strategies, including synthetic data simulation, are outlined in Section 3.3 to guide future empirical work.

2. From Theory to Data: Operationalising the Prognostic Constructs

A prognostic model is only as powerful as the data it learns from. Building on the theoretical constructs introduced earlier—the Patient State Vector (PSV), Therapeutic Impulse Function (TIF), and Predicted Recovery Trajectory (PRT)—this section provides a detailed blueprint for operationalising these components as multi-modal data structures suitable for computational modelling and empirical validation, specifying the data sources, feature engineering pipelines, and encoding strategies that form the inputs and outputs of the proposed prognostic model. The emphasis is placed on creating a data architecture that is both technically robust and compatible with clinical and ethical standards for real-world deployment.

2.1. The Patient State Vector (PSV): A Multi-Modal, High-Frequency Data Architecture

The PSV represents a comprehensive, temporally dynamic description of a patient’s state [23]. Its construction requires the integration of heterogeneous data streams, each with different temporal resolutions and characteristics. A key innovation of the PSV is its departure from the simple, flat baseline feature vector of traditional predictive models: it is designed as a chronologically aware, multi-scale representation of an evolving system. It combines static features (e.g., genetics), low-frequency time-series (e.g., weekly clinical assessments), and high-frequency time-series (e.g., continuous sensor data). This nested temporal structure introduces a significant methodological challenge: the model must not only fuse heterogeneous data types but also capture the complex cross-scale dependencies that occur over time. Addressing this requires data fusion strategies that preserve temporal alignment and contextual relationships, going beyond simple feature concatenation. The following subsections outline the three key data domains and the process through which they are integrated into the PSV architecture.
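To make the nested temporal structure concrete, the following minimal Python sketch illustrates one possible container for the PSV; the field names, the use of pandas, and the daily alignment step are illustrative assumptions rather than a prescribed schema.

```python
from dataclasses import dataclass, field

import numpy as np
import pandas as pd


@dataclass
class PatientStateVector:
    """Illustrative container for the three temporal scales of the PSV."""
    # Static features (time-invariant): e.g., encoded genotype, demographics.
    static: dict = field(default_factory=dict)
    # Low-frequency series: e.g., weekly PHQ-9 / HAM-D scores, indexed by date.
    low_freq: pd.DataFrame = field(default_factory=pd.DataFrame)
    # High-frequency series: e.g., daily digital-phenotype summaries, indexed by date.
    high_freq: pd.DataFrame = field(default_factory=pd.DataFrame)

    def as_model_inputs(self):
        """Align the two time-series scales on the daily grid and expose the
        static block separately, mirroring the input split used in Section 3.2."""
        dynamic = self.high_freq.join(
            self.low_freq.reindex(self.high_freq.index, method="ffill"),
            how="left",
        )
        static = np.array(list(self.static.values()), dtype=float)
        return static, dynamic
```

Such a container keeps the cross-scale alignment explicit: low-frequency clinical scores are carried forward onto the daily grid of the digital phenotype, while static features remain a separate conditioning block.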

2.1.1. Clinical Data

This domain includes the information typically collected in clinical trials and practice. While often considered “baseline” data, a dynamic prognostic approach requires capturing these measures longitudinally.
  • Data Sources: Electronic Health Records (EHRs), clinical interviews, and Patient-Reported Outcome Measures (PROMs).
  • Metrics:
    Symptom Severity: Standardised scales such as the Patient Health Questionnaire-9 (PHQ-9) and the Hamilton Depression Rating Scale (HAM-D), collected at regular intervals (e.g., weekly) to track the evolution of symptoms.
    Diagnostic History: Coded diagnoses (e.g., ICD-10) for MDD, comorbidities (e.g., anxiety disorders, substance use disorders), and history of prior treatment attempts.
    Demographics: Age, gender, socioeconomic status, and other relevant demographic variables.
  • Feature Engineering: Raw scores from clinical scales are used directly. Diagnostic history can be one-hot encoded. Longitudinal scores form a low-frequency time-series input to the model. Data harmonisation and standardisation protocols are essential to ensure comparability across sources and enable reproducible research.
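As an illustration of this encoding step, the following minimal sketch shows how weekly scale scores and one-hot diagnostic history might be assembled; the column names, ICD-10 code vocabulary, and example values are assumptions for illustration only.

```python
import pandas as pd

# Illustrative weekly clinical records for one patient (values are synthetic).
visits = pd.DataFrame({
    "week": [0, 1, 2, 4],
    "phq9": [21, 19, 16, 12],
    "hamd": [24, 22, 18, 14],
})

# Diagnostic history as one-hot indicators over an assumed ICD-10 code list.
icd_codes = ["F32", "F33", "F41", "F10"]
patient_history = {"F33", "F41"}                     # this patient's coded diagnoses
diagnosis_onehot = pd.Series({c: int(c in patient_history) for c in icd_codes})

# Weekly scores form the low-frequency time-series channel of the PSV,
# while the one-hot diagnostic block joins the static conditioning features.
low_freq_series = visits.set_index("week")[["phq9", "hamd"]]
```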

2.1.2. Biological Data

This domain captures the neurobiological and physiological substrates that may influence or moderate treatment response. These data sources tend to be sparse, expensive to obtain, and typically static or low-frequency in nature, yet they provide essential context for individual variability.
  • Data Sources: Genetic assays, neuroimaging scans, and blood or saliva samples.
  • Metrics:
    Pharmacogenetics: Genotypes for key genes influencing drug metabolism (e.g., cytochrome P450 enzymes like CYP2D6, CYP2C19) or drug targets [30].
    Neuroimaging: Measures derived from resting-state functional MRI (fMRI), such as functional connectivity within and between key networks (e.g., default mode network, salience network), or structural measures from T1-weighted MRI [31,32,33,34,35].
    Peripheral Biomarkers: Levels of inflammatory markers (e.g., high-sensitivity C-reactive protein), stress hormones (e.g., cortisol), and neurotrophic factors [36,37,38].
  • Feature Engineering: Genetic polymorphisms are encoded as categorical variables; neuroimaging and biomarker data are represented as numerical features. Because these data are static or extremely low-frequency, they function as conditioning variables that modulate the model’s dynamic predictions rather than forming a temporal sequence. Clear documentation of acquisition protocols and feature preprocessing is necessary to ensure replicability and comparability across datasets [39].

2.1.3. Digital Phenotype Data

This domain is a cornerstone of the PSV, providing a continuous, objective, and ecologically valid view of real-world behaviour [40]. It involves transforming high-frequency sensor data from personal devices into clinically meaningful behavioural indicators [40,41]. From an HCI perspective, these data also reflect the interaction between individuals and their digital environments, making them central to human-centred prognostic design.
  • Data Sources: Passively collected data from sensors embedded in smartphones and wearables (e.g., GPS, accelerometer, gyroscope, screen usage, call/text logs) [40,41].
  • Feature Engineering Pipeline: Raw sensor data is typically aggregated into daily or hourly summaries to align with the temporal scale of mood fluctuations. Major behavioural domains and representative features include:
    Mobility Patterns: Raw GPS coordinates are processed to derive features reflecting behavioural activation and social withdrawal. These include:
    • Location Variance: The statistical variance of latitude and longitude, capturing the geographic spread of a person’s movement.
    • Entropy: A measure of the diversity and predictability of visited locations. Lower entropy suggests a more restricted and repetitive routine.
    • Time Spent at Home: The proportion of time a user’s device is located at their inferred home location.
    • Distance Travelled: Total daily distance covered.
    Social Activity: Call and SMS text message logs (metadata only, not content) are used to quantify social engagement. Features include:
    • Number and Duration of Incoming/Outgoing Calls: A proxy for active social interaction.
    • Number of Unique Contacts: A measure of social network size.
    • Text Message Frequency: A proxy for passive social communication.
    Circadian Rhythms: The regularity of daily routines, particularly sleep-wake cycles, is a critical indicator of mental health. These patterns can be inferred from multiple sensors:
    • Phone Lock/Unlock Patterns: The longest continuous period of no screen interaction during nighttime hours can serve as a proxy for sleep duration [40,41].
    • Accelerometer Data: Periods of prolonged inactivity from a wearable can more directly measure sleep.
    • Circadian Regularity: A metric quantifying the stability of these patterns from day to day.
    Physical Activity: Accelerometer data is used to derive standard activity metrics like daily step count and time spent in different activity intensity levels (e.g., sedentary, light, moderate).
Digital phenotype features are high-frequency inputs that provide temporal granularity and behavioural context. When integrated with clinical and biological data, they enhance both model accuracy and interpretability, particularly within user-facing systems that visualise behavioural trajectories for clinical feedback.
Smartphone-based sensing is subject to sampling variability across devices and operating systems. GPS sampling may fluctuate depending on device movement, power-saving modes, or environmental conditions. Accelerometer sampling rates may degrade under battery optimisation settings or background execution limits. These sources of variability require normalisation procedures to ensure consistent daily feature extraction.
Noise and artefact removal is performed prior to aggregation: implausible GPS jumps are removed using speed thresholds, stationary accelerometer noise is filtered using standard deviation-based heuristics, and missingness arising from battery depletion or user behaviour is explicitly encoded. These corrections ensure that the temporal patterns captured in the PSV reflect behavioural signals rather than hardware or OS artefacts.
Although this framework focuses primarily on smartphone-based sensing, it is fully compatible with wearable accelerometers and physiological sensors such as heart rate variability (HRV) or electrodermal activity (EDA). These additional streams can be integrated into the PSV as higher-frequency inputs for enhanced temporal resolution.
Sensor Data Acquisition Pipeline
The digital phenotype features used in this framework originate from continuous streams collected through smartphones and wearables. GPS coordinates are typically sampled at 1 Hz, accelerometer data at 50–100 Hz depending on device settings, and screen interaction events are captured as discrete timestamps. All raw timestamps are first converted into a unified time base (UTC), and device-specific clock drift is corrected when available.
To enable integration within the PSV, raw high-frequency streams are aggregated into daily summaries through a preprocessing pipeline consisting of: (1) timestamp alignment, (2) noise and artefact removal (e.g., invalid GPS points, accelerometer spikes), (3) interpolation for short gaps, and (4) extraction of behavioural indicators such as location entropy, inferred sleep duration, circadian regularity, and mobility radius.
Longer gaps due to battery loss, sensor deactivation, or device shutdown are handled by explicit missingness markers, ensuring that the time-aware LSTM can model irregular sampling intervals [42]. This pipeline provides a structured and reproducible method for converting heterogeneous sensor data into the dynamic features of the PSV.
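The following sketch illustrates steps (2)–(4) of this pipeline for the mobility domain, deriving daily location variance, location entropy, and time at home from raw GPS samples; the column names, speed threshold, grid resolution, home radius, and the crude degree-to-kilometre conversion are illustrative assumptions.

```python
import numpy as np
import pandas as pd


def daily_mobility_features(gps: pd.DataFrame, home: tuple, max_speed_kmh: float = 200.0):
    """Aggregate raw GPS samples (columns: timestamp [datetime64], lat, lon)
    into daily mobility features. Thresholds are illustrative, not validated."""
    gps = gps.sort_values("timestamp").copy()

    # Artefact removal: drop implausible jumps using a crude speed threshold.
    dt_h = gps["timestamp"].diff().dt.total_seconds() / 3600.0
    dist_km = np.hypot(gps["lat"].diff(), gps["lon"].diff()) * 111.0  # rough degrees -> km
    speed = dist_km / dt_h.replace(0, np.nan)
    gps = gps[~(speed > max_speed_kmh)].copy()

    gps["date"] = gps["timestamp"].dt.date
    rows = []
    for date, day in gps.groupby("date"):
        # Location variance: geographic spread of the day's movement.
        loc_var = day["lat"].var() + day["lon"].var()
        # Location entropy over a coarse spatial grid (~100 m cells).
        cells = day["lat"].round(3).astype(str) + "_" + day["lon"].round(3).astype(str)
        p = cells.value_counts(normalize=True)
        entropy = float(-(p * np.log(p)).sum())
        # Time at home: fraction of samples within ~200 m of the inferred home location.
        time_at_home = (np.hypot(day["lat"] - home[0], day["lon"] - home[1]) * 111.0 < 0.2).mean()
        rows.append({"date": date, "loc_variance": loc_var,
                     "loc_entropy": entropy, "time_at_home": time_at_home})
    return pd.DataFrame(rows).set_index("date")
```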

2.1.4. Data Integration and Handling

The PSV constitutes an inherently multi-modal, multi-rate, and partially asynchronous time-series dataset, posing challenges in data fusion, missingness, and temporal alignment [43,44]. A reproducible methodological framework must address these challenges through well-defined preprocessing and harmonisation procedures.
  • Aggregation: High-frequency digital phenotype data is aggregated to a consistent temporal resolution (e.g., daily summaries).
  • Imputation: For sparse data streams like weekly clinical scores or occasional biomarkers, appropriate imputation or interpolation techniques (e.g., mean imputation, forward-fill, or more sophisticated model-based imputation) must be applied; a minimal sketch follows this list.
  • Asynchrony: The asynchronous nature of data collection across modalities is a core temporal modelling challenge. This can be mitigated at both the preprocessing and modelling stages. Architecturally, time-aware mechanisms, such as the gated structures described in Section 3, can explicitly represent time intervals between observations and learn from irregular sampling patterns [45,46,47].
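The sketch referenced above illustrates one simple treatment of sparse channels: forward-filling observed values while retaining explicit missingness indicators and per-channel elapsed-time features that the time-aware model can exploit. A daily DatetimeIndex and purely numeric columns are assumed.

```python
import pandas as pd


def align_with_missingness(daily: pd.DataFrame) -> pd.DataFrame:
    """Forward-fill sparse channels of a daily-indexed frame while keeping
    explicit missingness masks and days-since-last-observation features."""
    aligned = daily.copy()
    for col in daily.columns:
        observed = daily[col].notna()
        aligned[f"{col}_missing"] = (~observed).astype(int)          # missingness indicator
        last_obs = daily.index.to_series().where(observed).ffill()   # date of last observation
        aligned[f"{col}_delta_days"] = (daily.index.to_series() - last_obs).dt.days
        aligned[col] = daily[col].ffill()                            # simple forward-fill
    return aligned
```

More sophisticated model-based imputation can replace the forward-fill without changing the surrounding interface.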
To support implementation, Table 1 provides a structured data dictionary that defines the variables, sources, and temporal properties comprising the PSV.
Before modelling outcomes, the framework must also formalise the treatment input itself. Complementing the PSV, the Therapeutic Impulse Function (TIF) represents each intervention in a structured, quantitative way that allows the model to simulate how different treatments influence patient trajectories.

2.2. The Therapeutic Impulse Function (TIF): A Formalism for Treatment Inputs

To test the theory’s proposition that treatment outcomes depend on a specific PSV-TIF interaction, the treatment itself must be represented in a standardised and parameterised format [23]. Simply using a treatment’s brand name as a categorical label discards crucial information about its underlying properties. The TIF therefore represents each intervention as a feature vector describing its pharmacological or procedural properties, transforming treatment selection from a heuristic process into a data-driven problem of system control and optimisation. This reframing allows for a more mechanistic analysis, moving from the question “Which drug is best?” to “For a system in state PSV, what TIF vector will most efficiently guide the system trajectory towards remission?”
  • Pharmacotherapy TIF: A vector representation for a medication would include:
    Mechanism of Action: A multi-hot encoded vector representing the primary neurotransmitter systems targeted (e.g., a vector flagging both serotonin and norepinephrine reuptake inhibition for an SNRI).
    Elimination Half-life: A numerical feature for the drug’s elimination half-life (in hours), which governs dosing frequency and time to steady state [30].
    Metabolism: A categorical feature representing the primary CYP450 enzyme responsible for its metabolism (e.g., CYP2D6), allowing for direct modelling of gene-drug interactions.
    Dose: A normalised numerical feature representing the prescribed daily dosage relative to the standard therapeutic range.
  • Psychotherapy TIF: A vector representation for a psychotherapy would include:
    Modality: A one-hot encoded vector for the primary therapeutic approach (e.g., Cognitive Behavioural Therapy, Psychodynamic Therapy, Interpersonal Therapy) [25].
    Dose and Schedule: Numerical features for session duration (in minutes) and frequency (sessions per week).
    Process Variables (if available): If data from session ratings is available, a numerical feature for the therapeutic alliance score could be included as a time-varying component of the TIF.
This structured TIF serves as a static or dynamic input to the prognostic model, allowing it to learn not only that treatments differ in outcome, but how and why they do so, based on their intrinsic properties. This representation also facilitates interpretability and integration with decision-support interfaces, where individual treatment parameters can be adjusted and their forecasted effects visualised.
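The following minimal sketch shows how a pharmacotherapy TIF vector of this kind could be assembled; the neurotransmitter and enzyme vocabularies and the example drug parameters are placeholders rather than verified pharmacology.

```python
import numpy as np

# Illustrative vocabularies (assumptions for this sketch, not a fixed ontology).
NT_SYSTEMS = ["serotonin", "norepinephrine", "dopamine"]
CYP_ENZYMES = ["CYP2D6", "CYP2C19", "CYP3A4", "other"]


def pharmacotherapy_tif(mechanisms, half_life_h, cyp, dose, dose_range):
    """Encode a medication as a TIF vector: multi-hot mechanism of action,
    elimination half-life, one-hot metabolising enzyme, and normalised dose."""
    mech = np.array([float(m in mechanisms) for m in NT_SYSTEMS])
    cyp_onehot = np.array([float(cyp == e) for e in CYP_ENZYMES])
    lo, hi = dose_range
    dose_norm = np.clip((dose - lo) / (hi - lo), 0.0, 1.0)
    return np.concatenate([mech, [half_life_h], cyp_onehot, [dose_norm]])


# Example with placeholder values (not verified pharmacology):
tif = pharmacotherapy_tif(
    mechanisms={"serotonin", "norepinephrine"},   # SNRI-like mechanism of action
    half_life_h=12.0, cyp="CYP2D6", dose=100.0, dose_range=(50.0, 200.0),
)
```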

2.3. The Predicted Recovery Trajectory (PRT): Defining the Forecasting Target

The PRT is the primary output and evaluation target of the prognostic framework. Unlike the single-endpoint predictions of static models, the PRT is operationalised as a multi-step sequential forecast representing the expected evolution of symptom severity.
Definition: The PRT is a vector of predicted symptom severity scores (e.g., weekly PHQ-9 scores) over a clinically relevant future horizon, such as 12 weeks. For a model making predictions at week 0, the target output would be [PHQ9_week1, PHQ9_week2, …, PHQ9_week12].
Probabilistic Forecasting: A clinically useful forecast must also communicate uncertainty and confidence intervals. Accordingly, the model should output not only a point estimate at each time step but a predictive distribution. This can be achieved through techniques like Monte Carlo dropout or by training the model to predict the parameters (mean and variance) of a Gaussian distribution for each time step. This allows for the visualisation of prediction intervals, giving clinicians a sense of the likely range of future outcomes.
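One way to realise such probabilistic output is a forecasting head that predicts a mean and variance for each week of the horizon and is trained with a Gaussian negative log-likelihood; the sketch below assumes a PyTorch setting and illustrative layer sizes. Monte Carlo dropout is an alternative that requires no change to the loss.

```python
import torch
import torch.nn as nn


class GaussianTrajectoryHead(nn.Module):
    """Map a summary hidden state to a 12-step PRT: one (mean, variance) pair
    per forecast week, enabling prediction intervals rather than point estimates."""
    def __init__(self, hidden_dim: int = 64, horizon: int = 12):
        super().__init__()
        self.mean = nn.Linear(hidden_dim, horizon)
        self.log_var = nn.Linear(hidden_dim, horizon)

    def forward(self, h):
        mu = self.mean(h)
        var = torch.exp(self.log_var(h))          # strictly positive variance
        return mu, var


def gaussian_nll(mu, var, target):
    # Negative log-likelihood of the observed trajectory under the predicted Gaussians.
    return 0.5 * (torch.log(var) + (target - mu) ** 2 / var).mean()
```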
Figure 1 summarises the flow from PSV and TIF to the PRT.
The figure illustrates the flow of information through the proposed methodological system. Multi-modal data streams—including clinical, biological, and digital phenotype domains—are fused into the Patient State Vector (PSV), a dynamic representation of individual state. Treatment characteristics are formalised as the Therapeutic Impulse Function (TIF). These two components jointly serve as inputs to a time-aware Long Short-Term Memory (LSTM) architecture, which forecasts the Predicted Recovery Trajectory (PRT). The PRT output is then embedded within a translational pipeline encompassing interpretability (Explainable AI), ethical governance, and human-centred interface design for clinical decision support. Together, these layers form an integrated framework for developing and deploying prognostic AI in mental health.

3. A Dynamic Forecasting Architecture for the Predicted Recovery Trajectory

With the data structures for the model’s inputs (PSV, TIF) and output (PRT) defined, this section specifies the computational engine of the prognostic framework. Consistent with the theoretical emphasis on temporal dependency and dynamic change, the choice of model architecture is not merely a technical detail; it is a theoretical commitment. A model that cannot process time-series data or capture path-dependent dynamics is fundamentally incompatible with the prognostic theory. This section justifies the selection of a Recurrent Neural Network (RNN) architecture, specifically a time-aware Long Short-Term Memory (LSTM) network with optional attention mechanisms, as the most appropriate tool for this task [48].

3.1. Model Selection Rationale: From State-Space Models to Recurrent Neural Networks

The challenge of modelling psychological and physiological time-series has a long history. A powerful and conceptually elegant approach is the state-space model (SSM) [49,50]. SSMs formalise a system by separating a latent (unobserved) state, which evolves over time according to some underlying dynamic, from the observed measurements, which are manifestations of that latent state. This formalism aligns perfectly with the prognostic theory: the “true” depressive state of a patient is a latent construct, and the various components of the PSV (symptom scores, mobility patterns, etc.) are its noisy, observable indicators. However, classical SSMs often assume linear dynamics and Gaussian noise, which may be overly restrictive for capturing the complex, non-linear processes of psychiatric recovery.
Recurrent Neural Networks (RNNs) can be understood as a powerful, non-linear generalisation of this state-space concept [48,51]. An RNN processes a sequence of inputs one step at a time, maintaining an internal “hidden state” vector that acts as a form of memory. At each time step, the network updates its hidden state based on the current input and its previous hidden state. This hidden state is, in effect, a high-dimensional, learned latent representation of the system’s history, analogous to the state vector in an SSM [39,46,48]. The key advantage is that the complex, non-linear functions governing the state transitions and the mapping from state to observation are learned automatically from the data by the neural network, rather than being pre-specified.
Within the family of RNNs, the Long Short-Term Memory (LSTM) architecture is particularly well-suited for clinical data [18,28,29,48,52,53]. Standard RNNs struggle with the “vanishing gradient” problem, making it difficult for them to learn long-range temporal dependencies. LSTMs overcome this limitation through a series of “gates” (input, forget, and output gates) that explicitly control the flow of information into, out of, and within the cell’s memory. This gating mechanism allows the LSTM to selectively remember important information from the distant past while discarding irrelevant details, a crucial capability for modelling the weeks-long or months-long process of treatment response. The choice of an LSTM, therefore, is not just a technical preference but a computational operationalisation of the theoretical assumption that recovery is path-dependent. Unlike a simple regression model that assumes the future is a direct function of the baseline state, an LSTM’s predictions are conditioned on the entire history of the system as encoded in its hidden state. This structure inherently embodies the theory’s proposition that how a patient arrives at their current state is as important as the state itself [23].
This approach stands in contrast to traditional time-series models like ARIMA, which, while useful for univariate forecasting, are ill-equipped to handle the multi-modal inputs of the PSV and struggle to capture the complex, non-linear dynamics inherent in biological systems. Moreover, the LSTM’s ability to learn temporal dependencies from heterogeneous data makes it particularly suitable for integration into human-centred decision-support systems that require interpretable, time-evolving outputs.
While Transformer-based architectures (e.g., Self-Attention mechanisms) have achieved state-of-the-art results in many time-series domains, we explicitly propose a Time-Aware LSTM for this specific clinical application for two reasons [42]. First, data sparsity: clinical datasets, particularly those with deep phenotyping, rarely reach the sample sizes required to train Transformers without significant overfitting. LSTMs generally demonstrate better inductive bias for smaller, sequence-dependent datasets common in psychiatry [39,54,55,56]. Second, explicit temporal modelling: standard Transformers rely on positional encodings which can struggle with the highly irregular, continuous-time nature of patient visits (e.g., gaps of 2 days vs. 2 months). The Time-Aware LSTM’s gate mechanism (Equation (1)) offers a more direct and biologically plausible method for modelling the decay of information over irregular intervals. Future iterations of this framework may explore “Time-Aware Transformers” as data availability increases [42,46,47].

3.2. Proposed Architecture: A Multi-Input, Time-Aware LSTM for Prognosis

To operationalise the prognostic framework, a multi-input, stacked, time-aware LSTM architecture is proposed. This architecture is specifically designed to address the unique challenges of the PSV and TIF data structures.
Mathematical Formalism of Inputs:
Let the clinical trajectory of a patient be represented by a sequence of observations. We define the model inputs as:
  • Dynamic Inputs ($x_t$): A sequence of time-varying vectors $X = (x_1, x_2, \ldots, x_T)$, where $x_t \in \mathbb{R}^d$ represents the dynamic PSV features (e.g., symptom scores, aggregated digital phenotype metrics) at time step $t$.
  • Static Inputs ($s$): A time-invariant vector $s \in \mathbb{R}^k$ representing the static PSV (genetics, demographics) and the Therapeutic Impulse Function (TIF).
  • Time Gaps ($\Delta t_t$): To account for irregularity, we explicitly calculate the elapsed time between consecutive observations as $\Delta t_t = t_{\mathrm{cur}} - t_{\mathrm{prev}}$, where $t$ represents the absolute timestamp.
Handling Multi-Modal Inputs:
The model features separate input pathways to accommodate these data types.
  • Dynamic Pathway: The sequence $x_t$ is fed into the first LSTM layer to encode temporal dynamics: $h_t^{(1)} = \mathrm{LSTM}_1\!\left(x_t, h_{t-1}^{(1)}\right)$.
  • Static Pathway: The static vector $s$ is concatenated with the hidden state output of the first layer ($\oplus$ denotes concatenation) to condition the high-level feature learning: $u_t = h_t^{(1)} \oplus s$. This fused vector $u_t$ serves as the input to the second LSTM layer: $h_t^{(2)} = \mathrm{LSTM}_2\!\left(u_t, h_{t-1}^{(2)}\right)$.
Time-Awareness:
A critical limitation of standard LSTMs is that they treat observations as if they occur at regular time intervals, ignoring the actual time gaps $\Delta t_t$. To overcome this, the proposed architecture incorporates explicit time-decay gates into the LSTM cell, inspired by the KIT-LSTM model [45].
Standard LSTMs update the cell memory $c_t$ using an input gate $i_t$, forget gate $f_t$, and candidate memory $\tilde{c}_t$. We modify this by introducing a time-decay gate $g_t$ that is a monotonic decreasing function of the time gap:
$$ g_t = \sigma\!\left( \frac{1}{\log(e + \Delta t_t)} \right) \tag{1} $$
This gate acts on the previous cell memory $c_{t-1}$ to discount information based on the elapsed time, creating an adjusted memory state $c_{t-1}^{*}$ before the standard LSTM update occurs:
$$ c_{t-1}^{*} = c_{t-1} \odot g_t \tag{2} $$
where $\odot$ denotes element-wise multiplication. This mechanism allows the model to “forget” more of its memory as the time gap $\Delta t_t$ increases, ensuring that a symptom score from yesterday is weighted more heavily than one from two weeks ago. This step performs a temporal correction on the historical memory, allowing recent clinical states to retain a stronger influence while attenuating information originating from older or irregularly spaced observations.
Using this time-adjusted memory, the final LSTM cell-state update follows the standard formulation, with the forget gate $f_t$, input gate $i_t$, and candidate memory $\tilde{c}_t$:
$$ c_t = f_t \odot c_{t-1}^{*} + i_t \odot \tilde{c}_t \tag{3} $$
Together, these two equations (Equations (2) and (3)) constitute the complete Time-Aware LSTM memory mechanism. The first equation embeds sensitivity to temporal irregularity, while the second integrates the decayed historical information with new clinical input. This sequential formulation preserves the interpretability and gating structure of the classical LSTM while enabling principled modelling of PSV trajectories with inconsistent sampling intervals.
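A minimal PyTorch sketch of this mechanism is given below, wrapping a standard LSTM cell with the time-decay gate of Equations (1)–(3); batch-first tensor shapes and a scalar gate broadcast across memory dimensions are simplifying assumptions.

```python
import math

import torch
import torch.nn as nn


class TimeAwareLSTMCell(nn.Module):
    """Standard LSTM cell preceded by the time-decay gate of Equations (1)-(3):
    the previous cell memory is discounted as a function of the elapsed time."""
    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.cell = nn.LSTMCell(input_dim, hidden_dim)

    def forward(self, x_t, delta_t, state):
        # x_t: (batch, input_dim); delta_t: (batch,); state: (h_prev, c_prev).
        h_prev, c_prev = state
        # Equation (1): g_t = sigmoid(1 / log(e + delta_t)); here one scalar gate per
        # sample, broadcast over all memory dimensions for simplicity.
        g_t = torch.sigmoid(1.0 / torch.log(delta_t + math.e)).unsqueeze(-1)
        # Equation (2): discount the previous cell memory by the elapsed time.
        c_star = c_prev * g_t
        # Equation (3): standard LSTM update applied to the time-adjusted memory.
        h_t, c_t = self.cell(x_t, (h_prev, c_star))
        return h_t, c_t
```

In use, the cell is unrolled over the observation sequence, receiving at each step the dynamic input $x_t$ and the elapsed time $\Delta t_t$ computed during preprocessing.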
Attention Mechanism (Extension):
For enhanced performance and interpretability, the architecture can be extended with a temporal attention mechanism. The model computes a context vector $z$ as a weighted sum of past hidden states $h_i$:
$$ z = \sum_{i=1}^{T} \alpha_i h_i $$
where $\alpha_i$ are attention weights derived from a learned compatibility function. This allows the model to dynamically focus on specific historical periods (e.g., weeks 2–4) when forecasting the PRT, directly addressing the theory’s proposition about the importance of early response dynamics [23].
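A minimal sketch of such an attention layer is shown below; the scoring function is a single learned projection, which is an illustrative simplification of the compatibility function.

```python
import torch
import torch.nn as nn


class TemporalAttention(nn.Module):
    """Compute the context vector z as a weighted sum of past hidden states,
    returning the attention weights for temporal-saliency inspection."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, hidden_states):                        # (batch, T, hidden_dim)
        scores = self.score(hidden_states).squeeze(-1)       # (batch, T)
        alpha = torch.softmax(scores, dim=-1)                # attention weights
        z = torch.bmm(alpha.unsqueeze(1), hidden_states).squeeze(1)   # context vector
        return z, alpha
```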
Interpretability and Clinical Integration:
To align with the human-centred orientation of this work, the modular structure allows for transparent analysis. The attention weights $\alpha_i$ provide a direct measure of temporal saliency, indicating which time points most influenced a specific prognosis. Similarly, gradient-based attribution methods can be applied to the static input vector $s$ to quantify the contribution of specific TIF parameters (e.g., drug dosage) or biological markers to the predicted trajectory.

3.3. Training, Validation, and Evaluation Protocol

A rigorous and transparent protocol for model development and evaluation is essential for ensuring the reliability and replicability of the results. The focus of this protocol must shift from optimising for the best endpoint classification accuracy to achieving the most clinically informative and accurate trajectory forecast.
  • Data Preprocessing: All continuous input features from the PSV and TIF will be standardised (e.g., z-score normalisation) to ensure they are on a comparable scale, a standard practice for training neural networks.
  • Training: The model will be trained end-to-end using backpropagation through time (BPTT). The Adam optimiser, an adaptive learning rate optimisation algorithm, will be used to minimise the loss function. The primary loss function will be the Root Mean Squared Error (RMSE) calculated across all time points of the predicted and true trajectories.
  • Validation: To prevent information leakage and ensure the model generalises to unseen patients, a patient-level, nested cross-validation scheme will be employed. The outer loop splits patients into training and test sets. The inner loop performs hyperparameter tuning (e.g., number of LSTM units, learning rate, dropout rate) on a validation set carved out from the training set. This approach ensures patient-level independence and provides a robust estimate of model generalisability.
  • Evaluation Metrics: The evaluation must reflect the prognostic goal of forecasting the entire trajectory. While traditional classification metrics can be reported for benchmarking, the primary metrics should be:
    Trajectory-wise Root Mean Squared Error (RMSE) or Mean Absolute Error (MAE): The average error calculated across all forecasted time points in the PRT. This is the primary measure of the model’s ability to accurately predict the entire path of recovery.
    Endpoint MAE: The absolute error at the final time point of the forecast horizon (e.g., week 12). This allows for direct comparison with static models that only predict this single point.
    Dynamic Time Warping (DTW): A metric that measures the similarity between two temporal sequences that may vary in speed. It is particularly useful for assessing whether the model has captured the correct shape of the recovery trajectory (e.g., rapid initial response followed by a plateau), even if it is slightly off in its timing.
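The three trajectory metrics can be computed with a few lines of NumPy, as sketched below; the DTW implementation is a basic unconstrained dynamic-programming version intended for illustration rather than an optimised library routine.

```python
import numpy as np


def trajectory_rmse(pred, true):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(true)) ** 2)))


def endpoint_mae(pred, true):
    return float(abs(pred[-1] - true[-1]))


def dtw_distance(pred, true):
    """Basic dynamic-programming DTW between two trajectories (no window constraint)."""
    a, b = np.asarray(pred, float), np.asarray(true, float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])
```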
To demonstrate the value of this dynamic, multi-modal approach, the proposed model must be benchmarked against a series of simpler models. Table 2 outlines this evaluation framework.
This structured comparison will allow for a clear quantification of the performance gains attributable to (a) using a dynamic model over a static one, (b) incorporating multi-modal data over univariate data, and (c) explicitly modelling irregular time intervals. This presents a rigorous, evidence-based argument in support of the proposed methodology.
In addition to quantitative performance, qualitative inspection of trajectory shapes and attention maps will provide evidence for clinical plausibility and interpretability—key criteria for human-centred AI in psychiatry.

Proposed Synthetic Validation Protocol

To validate the architectural advantages of the Time-Aware LSTM prior to clinical deployment, we propose a synthetic data experiment designed to isolate the effects of temporal irregularity.
  • Data Generation: We generate synthetic patient trajectories $y(t)$ using a damped harmonic oscillator equation with injected noise, representing the cyclical but decaying nature of mood episodes.
  • Irregular Sampling Simulation: We sample this ground-truth trajectory at random intervals $\Delta t \sim \mathrm{Exponential}(\lambda)$, creating a “sparse” observation set typical of clinical data.
  • Hypothesis Testing: We test the hypothesis that standard LSTMs will fail to capture the recovery rate because they treat the random $\Delta t$ as constant steps. The Time-Aware LSTM, using the gate $g_t = \sigma\!\left(1/\log(e + \Delta t)\right)$, is expected to recover the underlying decay parameter despite the irregular sampling.
This protocol allows for the verification of the “Time-Aware” mechanism (Equations (1)–(3)) independent of clinical noise, serving as a necessary gate before training on expensive real-world data.
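A minimal sketch of this data-generation step is given below; the damped-oscillation parameters, noise level, and mean observation gap are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)


def synthetic_trajectory(t, baseline=22.0, decay=0.05, freq=0.8, noise_sd=1.0):
    """Damped, noisy symptom course: exponential recovery with a decaying
    cyclical component (all parameter values are illustrative)."""
    y = baseline * np.exp(-decay * t) * (1.0 + 0.2 * np.cos(freq * t))
    return y + rng.normal(0.0, noise_sd, size=np.shape(t))


def irregular_sampling(horizon_days=84.0, mean_gap_days=5.0):
    """Observation times with gaps drawn from Exponential(lambda), lambda = 1 / mean gap."""
    times, t = [], 0.0
    while t < horizon_days:
        t += rng.exponential(mean_gap_days)
        times.append(t)
    return np.array(times[:-1])


t_obs = irregular_sampling()                         # irregular observation times (days)
y_obs = synthetic_trajectory(t_obs)                  # sparse, noisy observations of y(t)
delta_t = np.diff(np.concatenate([[0.0], t_obs]))    # inputs to the time-decay gate
```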

4. Bridging the Translational Gap: A Blueprint for Responsible Implementation

A prognostic model, no matter how accurate, is clinically inert if it is untrusted, unusable, or unethical. The history of AI in medicine is littered with technically successful models that failed to translate into practice because these crucial human-centred factors were ignored [23]. Therefore, a complete methodological framework must not end with model validation; it must extend to include a proactive blueprint for responsible implementation. The deployment of a prognostic model creates a new form of socio-technical-ethical debt. The act of collecting the highly sensitive PSV incurs a debt to patient privacy; the act of generating a high-stakes forecast incurs a debt of accountability. These interdependent debts highlight that model performance and ethical integrity are inseparable dimensions of translational success. The ultimate success metric for this framework is not simply predictive accuracy, but clinical adoption and stakeholder trust, which are fundamentally outcomes of good design, ethical integrity, and clear communication, as highlighted by the phases in Table 3.

4.1. Ensuring Clinical Trust: From Black Box to Interpretable Prognosis (XAI)

The “black box” nature of deep learning models is one of the most significant barriers to their clinical adoption [23,57,58]. For a clinician to act on a model’s prediction—especially one that may contradict their own judgement—they must have a plausible and understandable rationale for that prediction [23]. This requires integrating Explainable AI (XAI) techniques into the core methodology. Interpretability is not a single feature but a multi-level requirement that must be tailored to the needs of different stakeholders: developers need technical diagnostics, clinicians need actionable clinical narratives, and patients need empowering insights. In this context, interpretability serves not only epistemic transparency but also epistemic responsibility, the obligation to make predictions that can be justified and communicated within clinical reasoning frameworks.
  • Feature Importance (Global and Local):
    Technique: Methods like SHAP (Shapley Additive Explanations) can be used to quantify the contribution of each input feature to a specific prediction.
    Clinical Application: After the model forecasts a PRT of non-response for a patient, SHAP can reveal the primary drivers of this prediction. For example, it might highlight that persistent sleep disruption (from the digital phenotype) and high baseline inflammatory markers are the top contributing factors. This transforms an opaque prediction (“non-responder”) into an interpretable clinical narrative (“The model predicts non-response, likely due to unresolved sleep and inflammation issues”), providing the clinician with a testable hypothesis to guide their next steps.
  • Temporal Saliency:
    Technique: If an attention mechanism is used, the attention weights themselves can be visualised to show which past time points the model focused on when making its forecast. Alternatively, gradient-based saliency methods can be used.
    Clinical Application: This can identify critical periods in a patient’s illness course. For instance, the model might learn that a small dip in mood and social activity during week 3, even if it recovers, is a strong predictor of eventual relapse. This provides clinicians with insight into the importance of early, subtle fluctuations that might otherwise be dismissed. Such outputs can be visualised directly in the interface, allowing clinicians to inspect which temporal dynamics contributed most to a forecast—enhancing both trust and educational value.
  • Counterfactual Explanations:
    Technique: These methods generate “what-if” scenarios by minimally perturbing the input to change the model’s output.
    Clinical Application: This provides the most actionable form of explanation. The system could answer questions like, “How would the predicted trajectory change if this patient’s daily step count increased by 2000?” or “What is the minimum improvement in sleep regularity needed to shift the forecast from non-response to response?” This bridges model output with behavioural and clinical levers of change, aligning algorithmic insight with therapeutic reasoning. This directly connects the model’s prediction to modifiable behaviours and helps in collaborative treatment planning between the clinician and patient.
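A counterfactual query of this kind reduces to re-running the forecast with one modifiable input perturbed, as in the hedged sketch below; `predict_prt` is a hypothetical interface standing in for the trained model of Section 3, and the feature index and increment are illustrative.

```python
import numpy as np


def counterfactual_trajectory(model, dynamic_seq, static_vec, delta_t, feature_idx, delta):
    """Compare the forecast PRT before and after perturbing one dynamic feature
    (e.g., daily step count) by `delta`. `model.predict_prt` is hypothetical."""
    baseline_prt = model.predict_prt(dynamic_seq, static_vec, delta_t)
    perturbed = np.array(dynamic_seq, dtype=float)        # shape (T, d) assumed
    perturbed[:, feature_idx] += delta                    # e.g., +2000 steps per day
    counterfactual_prt = model.predict_prt(perturbed, static_vec, delta_t)
    return baseline_prt, counterfactual_prt
```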

4.2. An Ethical Framework for Prognostic AI

The collection and use of data for the PSV, particularly the continuous and intimate data from digital phenotyping, raises profound ethical challenges that must be addressed proactively [23,59]. A robust ethical framework is not an optional add-on but a core component of the methodology. Ethics here is treated as a continuous design principle, not as a compliance checklist.
  • Data Privacy and Informed Consent:
    Challenge: The passive, continuous nature of digital phenotyping is incompatible with traditional, one-time consent models [40,41]. Patients must have an ongoing understanding and control over what data is being collected and how it is being used [60,61,62].
    Methodological Solution: Implementation of a dynamic consent interface within the data collection application. This would allow patients to view the data being collected, understand its purpose in simple terms, and have granular control to pause or withdraw specific data streams at any time. Furthermore, privacy-preserving machine learning techniques, such as federated learning, should be explored. In this approach, the model is trained on the user’s device, and only the updated model weights, not the raw personal data, are sent to a central server, significantly enhancing privacy [63]. Complementary approaches, such as differential privacy and secure multiparty computation, could further minimise re-identification risks in aggregated datasets.
  • Algorithmic Bias and Fairness:
    Challenge: AI models trained on historical data can learn and amplify existing societal biases and health disparities [23,64,65,66,67,68]. A model trained predominantly on one demographic group may perform poorly and unfairly on underrepresented populations, exacerbating inequities in care [69,70,71,72].
    Methodological Solution: A mandatory algorithmic bias audit must be part of the model validation protocol. This involves disaggregating model performance metrics (e.g., Trajectory RMSE) across key demographic subgroups (e.g., race, gender, age, socioeconomic status) [73,74]; a minimal audit sketch is provided below. If significant performance disparities are found, mitigation strategies must be employed, such as re-weighting the training data, applying fairness constraints during training, or collecting more data from the underperforming subgroup [73,74]. Results from these audits should be reported transparently in Supplementary Materials, establishing accountability and reproducibility.
  • Accountability and Responsibility:
    Challenge: If the model’s forecast contributes to an adverse patient outcome, who is responsible? The developer, the hospital, or the clinician who used the tool? [75].
    Methodological Solution: The framework must establish clear lines of accountability. The model must be legally and ethically framed as a Clinical Decision Support (CDS) tool, not a medical device that makes autonomous decisions. The final clinical judgement and responsibility must always reside with the human clinician. The system’s documentation and user interface must explicitly state its probabilistic nature, its limitations, and its role as an assistive tool to augment, not replace, professional expertise. Explicit audit logs should track how model outputs are used in clinical decisions, supporting traceability and learning from errors [75,76,77].
  • Regulatory Compliance:
    Challenge: The legal and regulatory landscape for AI in healthcare is rapidly evolving, with new guidelines from bodies like the WHO and FDA, and new laws at the state and national levels [73,74,78].
    Methodological Solution: The development process must incorporate a regulatory monitoring step, ensuring that data handling practices comply with existing regulations like HIPAA and GDPR, and that the tool’s classification (e.g., as a wellness app vs. a medical device) is appropriate and defended [59,73]. A risk-based framework should be adopted, where the level of regulatory scrutiny is proportional to the potential impact of the tool’s output on patient care [64,65,66,67,68].
This anticipatory governance approach ensures the model evolves in compliance with emerging ethical and legal norms rather than reacting post hoc.
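The bias audit described above amounts to disaggregating the primary trajectory metric by subgroup, as in the minimal sketch below; the column names and grouping variable are assumptions, and the relative-gap summary is one of several possible disparity measures.

```python
import numpy as np
import pandas as pd


def subgroup_rmse_audit(predictions: pd.DataFrame, group_col: str = "gender") -> pd.DataFrame:
    """Disaggregate trajectory RMSE by subgroup. `predictions` is assumed to hold
    one row per patient-week with columns: patient_id, group_col, y_true, y_pred."""
    def rmse(df):
        return float(np.sqrt(np.mean((df["y_true"] - df["y_pred"]) ** 2)))

    audit = predictions.groupby(group_col).apply(rmse).rename("trajectory_rmse").to_frame()
    # Relative gap to the best-performing subgroup; large gaps warrant mitigation (Section 4.2).
    audit["relative_gap"] = audit["trajectory_rmse"] / audit["trajectory_rmse"].min() - 1.0
    return audit
```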
Additional considerations for sensing-based prognostic systems: Within the context of mobile and wearable sensing, ethical considerations must also address the continuous and often unobtrusive collection of behavioural signals, including location traces and device-level identifiers. Key challenges involve managing real-time consent, deciding between local versus cloud-based processing, ensuring secure on-device storage, and applying differential privacy mechanisms to long-term behavioural streams. These issues differ from traditional clinical ethics and are essential for the responsible deployment of any prognostic framework that relies on sensor-derived data.

4.3. Human-Centred Design for Clinical Integration (HCI)

Even a perfectly accurate, interpretable, and ethical model will fail if it is not designed to fit seamlessly and usefully into the complex realities of clinical workflow [79,80]. The interface through which clinicians and patients interact with the prognostic forecasts is a critical component of the overall system. A human-centred design (HCD) process is therefore a foundational part of the methodology. By embedding usability testing and participatory design cycles throughout model development, the system’s interface becomes a co-evolving element of the methodology itself.
  • User-Centred Design Process:
    Challenge: CDS tools are often designed by engineers with little understanding of clinical realities, leading to poor workflow integration and low adoption rates [75].
    Methodological Solution: The design process must be iterative and participatory, involving clinicians, patients, and administrators from the earliest stages. Techniques like semi-structured interviews, workflow analysis (journey mapping), and the development of user personas can be used to deeply understand the needs, goals, and pain points of the end-users [75]. This ensures the final tool solves a real problem in a way that minimises, rather than increases, the clinician’s cognitive load. Iterative prototyping and usability metrics (e.g., task completion time, NASA-TLX workload) should be systematically recorded to ensure continuous alignment with user needs.
  • Designing for Generalisability and Fairness:
    Cross-Population Generalisation and Fairness: Ensuring fairness in prognostic AI extends beyond mitigating algorithmic bias during training. It also requires systematic evaluation of model generalisation across institutions, devices, and demographic groups. Performance audits should therefore be conducted across independent clinical sites and heterogeneous populations to identify and correct any degradation in accuracy or interpretability. Such cross-site validation not only strengthens the model’s robustness but also ensures equitable clinical benefit, preventing the amplification of existing disparities in access or outcomes.
  • Interface Design for Probabilistic Information:
    Challenge: Presenting a single, deterministic trajectory forecast can create a false sense of certainty and undermine clinical judgement.
    Methodological Solution: The interface must be designed to communicate uncertainty effectively. The primary visualisation should not be a single line, but a “cone of probability,” showing the mean predicted trajectory surrounded by shaded confidence intervals (e.g., 50% and 95% prediction intervals). The interface should also allow for interactive exploration, enabling the user to view the counterfactual explanations (“What if we change the TIF to this other medication?”) and see how the probabilistic forecast shifts in response. This interactivity is not cosmetic; it transforms the forecast into a shared reasoning space where clinicians can simulate and discuss treatment alternatives transparently. Simplicity, clarity, and effective use of colour and layout are paramount to ensure the information is glanceable and interpretable in a busy clinical setting [79,80].
  • Supporting the Therapeutic Alliance:
    Challenge: There is a significant risk that an AI tool could be perceived as an impersonal, algorithmic authority, thereby disempowering the patient and eroding the human connection at the heart of therapy [59,75].
    Methodological Solution: The CDS tool should be designed explicitly to facilitate a collaborative conversation. It can be used as a shared visual aid during a consultation. A clinician could show a patient their own digital phenotype data (e.g., “You can see here how your sleep regularity has improved over the last two weeks”) and then connect it to the forecasted PRT (“The model suggests this improvement is a very positive sign for your long-term recovery”). This reframes the tool from a top-down predictor to a bottom-up facilitator of shared understanding and self-efficacy, leveraging the patient’s own data to empower them in their recovery process. It supports the therapeutic relationship by providing a common, data-driven ground for discussion and goal-setting.
  • Illustrative Use Case: To illustrate the envisioned clinical application, consider a follow-up consultation in which a clinician uses the prognostic decision support interface with a patient. The model forecasts a flattening in the predicted recovery trajectory over the next four weeks and highlights disrupted sleep regularity and reduced mobility as primary contributing factors. Guided by this insight, the clinician and patient review recent behavioural data, discuss possible stressors, and agree on specific adjustments to improve daily routines and sleep hygiene. This interaction exemplifies how the proposed system can function as an interpretive, collaborative aid—supporting clinical reasoning and shared understanding rather than replacing human judgement.
Ultimately, the tool’s success should be evaluated not only by predictive metrics but by its measurable contribution to communication quality, patient engagement, and therapeutic alliance strength.

5. Discussion

5.1. Summary of the Methodological Contribution

This article has presented a comprehensive, end-to-end methodological framework for operationalising a prognostic theory of mental health treatment response. Its purpose is to provide the necessary conceptual and technical blueprint to move the field from the prevailing paradigm of static, endpoint-based prediction toward a more clinically relevant paradigm of dynamic, trajectory-based prognosis. By uniting theoretical constructs, computational architectures, and human-centred design principles, this framework articulates a coherent research agenda for prognostic psychiatry. The key contributions of this framework are threefold.
First, it provides a granular and practical guide for operationalising the core theoretical constructs of the prognostic theory. It details a multi-modal, high-frequency data architecture for the Patient State Vector (PSV), a formal vector representation for the Therapeutic Impulse Function (TIF), and defines the Predicted Recovery Trajectory (PRT) as a suitable, multi-step forecasting target. This moves the theory from an abstract concept to an empirically testable reality.
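For illustration only, the sketch below shows one possible way to represent these constructs as computational inputs in Python: the PSV as a sequence of timestamped observations, the TIF as coded intervention events, and the PRT as a multi-step probabilistic forecast. The field names, types, and granularity are assumptions chosen for readability rather than a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class PatientStateObservation:
    """One timestamped slice of the Patient State Vector (PSV)."""
    timestamp: datetime
    phq9_total: Optional[int] = None          # weekly clinical score (may be missing)
    location_entropy: Optional[float] = None  # daily digital-phenotype features
    sleep_duration_h: Optional[float] = None
    daily_steps: Optional[int] = None

@dataclass
class TherapeuticImpulse:
    """One entry of the Therapeutic Impulse Function (TIF): an intervention event."""
    timestamp: datetime
    modality: str                             # e.g. "SSRI" or "CBT" (illustrative coding)
    dose_or_intensity: float
    duration_days: Optional[int] = None

@dataclass
class RecoveryTrajectoryForecast:
    """Predicted Recovery Trajectory (PRT): a multi-step forecast with uncertainty."""
    horizon_weeks: List[int]
    mean_phq9: List[float]
    lower_95: List[float]
    upper_95: List[float]

@dataclass
class PatientRecord:
    """Container aligning the PSV and TIF sequences for a single patient."""
    patient_id: str
    psv: List[PatientStateObservation] = field(default_factory=list)
    tif: List[TherapeuticImpulse] = field(default_factory=list)
```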
Second, it proposes and justifies a theoretically grounded computational architecture for forecasting. The selection of a time-aware Long Short-Term Memory (LSTM) network is not arbitrary but is rooted in the dynamic, path-dependent nature of the recovery process as conceptualised by the theory. The architecture is specifically designed to handle the multi-modal, asynchronous, and irregularly sampled data that characterise real-world clinical research. It thus translates theoretical principles of temporal dependency and individual variability into a computational form that can be empirically interrogated.
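The core mechanism can be sketched as an LSTM cell whose previous memory is decomposed into short- and long-term components, with the short-term part discounted by the elapsed time since the last observation, in the spirit of the time-aware LSTM of Baytas et al. [42]. The PyTorch implementation below is a minimal illustration; the decay function, layer sizes, and interface are assumptions rather than the exact specification given earlier in this article.

```python
import math

import torch
import torch.nn as nn


class TimeAwareLSTMCell(nn.Module):
    """Minimal LSTM cell with a time-decay gate for irregularly sampled inputs
    (illustrative sketch in the spirit of the time-aware LSTM of Baytas et al. [42])."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.hidden_size = hidden_size
        # One linear map produces all four LSTM gates (input, forget, output, candidate).
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        # Decomposes the previous cell state into its short-term component.
        self.decompose = nn.Linear(hidden_size, hidden_size)

    def forward(self, x, delta_t, state):
        # x: (batch, input_size); delta_t: (batch,) elapsed time since the last step.
        h_prev, c_prev = state
        # Short-term memory and its time-discounted version.
        c_short = torch.tanh(self.decompose(c_prev))
        decay = 1.0 / torch.log(delta_t + math.e)   # monotone decay in elapsed time
        c_short_hat = c_short * decay.unsqueeze(-1)
        # Long-term memory passes through untouched; recombine into an adjusted state.
        c_adj = (c_prev - c_short) + c_short_hat
        # Standard LSTM gating applied to the time-adjusted cell state.
        z = self.gates(torch.cat([x, h_prev], dim=-1))
        i, f, o, g = z.chunk(4, dim=-1)
        c_new = torch.sigmoid(f) * c_adj + torch.sigmoid(i) * torch.tanh(g)
        h_new = torch.sigmoid(o) * torch.tanh(c_new)
        return h_new, c_new
```

Unrolled over a patient’s interleaved PSV and TIF sequence, with delta_t holding the gap (e.g., in days) between consecutive observations, the final hidden state can then feed a small regression head that emits the multi-step PRT forecast.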
Third, and most critically, it argues that a viable methodology cannot be purely technical. It integrates a holistic translational pipeline as a core, non-negotiable component. This includes a protocol for ensuring model interpretability via XAI, a governance framework for navigating the profound ethical challenges of prognostic AI, and a set of human-centred design principles for creating a clinical decision support tool that is usable, trustworthy, and supportive of the therapeutic alliance. In positioning interpretability, ethics, and usability as methodological pillars rather than afterthoughts, the framework embodies the interdisciplinary ethos required for responsible AI in mental health and proactively addresses the primary reasons why promising computational models so often fail to impact patient care.

5.2. Limitations and Future Directions

Despite its comprehensive nature, this proposed framework has significant limitations that point toward important avenues for future research. The most formidable challenge is the immense data requirement. Assembling the full PSV, with its integration of clinical, biological, and high-frequency digital phenotype data, is a logistically complex and resource-intensive undertaking. Few, if any, existing datasets contain all of these components collected longitudinally in a large cohort. This limitation underscores the call made in the original theoretical paper for the establishment of large-scale, openly shared, multi-modal longitudinal cohorts, which are an essential prerequisite for training and validating robust prognostic models [18,23]. Future collaborations between clinical consortia and digital health platforms may provide the infrastructure necessary to meet this challenge.
On the computational front, while LSTMs are well-suited for capturing temporal dependencies, they can be computationally expensive to train and may struggle to capture very long-range dependencies spanning many months. Future work should explore more recent and potentially more powerful sequence-modelling architectures, such as Transformers, which have shown state-of-the-art performance in other time-series domains [43,44,81]. The attention mechanism at the core of the Transformer architecture could be particularly adept at identifying complex, non-linear interactions across a patient’s entire history. Moreover, hybrid architectures combining LSTMs and Transformers may balance interpretability with scalability, offering a pragmatic bridge toward clinical deployment.
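As an indication of what such a future direction might look like, the sketch below applies a standard Transformer encoder to an irregularly sampled sequence by adding a learned embedding of each observation’s timestamp, so that attention can weigh temporally distant events directly. This is a hedged illustration only; the model dimensions, time-embedding scheme, and forecasting head are assumptions rather than a validated design.

```python
import torch
import torch.nn as nn


class TrajectoryTransformer(nn.Module):
    """Minimal Transformer encoder for multi-step trajectory forecasting
    over irregularly sampled sequences (illustrative sketch only)."""

    def __init__(self, n_features: int, d_model: int = 64,
                 n_heads: int = 4, n_layers: int = 2, horizon: int = 12):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)
        # A learned embedding of elapsed time replaces fixed positional encodings,
        # letting attention account for irregular sampling intervals.
        self.time_embed = nn.Linear(1, d_model)
        encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                                   batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, horizon)  # multi-step forecast (e.g., weekly PHQ-9)

    def forward(self, x, timestamps):
        # x: (batch, seq_len, n_features); timestamps: (batch, seq_len), e.g., in days.
        h = self.input_proj(x) + self.time_embed(timestamps.unsqueeze(-1))
        h = self.encoder(h)
        return self.head(h[:, -1, :])            # forecast from the last observation


# Hypothetical usage: 30 observations of a 16-feature PSV/TIF sequence for 8 patients.
model = TrajectoryTransformer(n_features=16)
x = torch.randn(8, 30, 16)
t = torch.cumsum(torch.rand(8, 30) * 5, dim=1)   # irregular gaps of 0-5 days
forecast = model(x, t)                            # shape: (8, 12)
```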
Furthermore, the framework presented here focuses on a single-disease model (MDD). The next evolution of this work will involve extending the methodology to transdiagnostic applications, modelling the dynamic interplay between different symptom domains (e.g., depression, anxiety, sleep) and forecasting trajectories across multiple dimensions of mental health simultaneously. Such extensions could pave the way toward a unified computational taxonomy of psychiatric prognosis, aligning with emerging dimensional frameworks like the Research Domain Criteria (RDoC).
A critical next step for this line of research is the empirical validation of the proposed framework in real-world clinical contexts. Future studies should adopt a staged evaluation strategy, beginning with retrospective validation using existing longitudinal datasets, followed by prospective observational studies, and ultimately clinician-in-the-loop trials. Such a phased approach would enable the assessment of predictive performance, interpretability, and usability in parallel, ensuring that methodological advances translate into genuine clinical value. Moreover, deployment studies within clinical decision support environments will be essential to evaluate the framework’s capacity to augment clinician reasoning, adapt to diverse workflows, and sustain ethical and transparent operation under real-world constraints.

5.3. Conclusion: A Roadmap for Prognostic Psychiatry

This article has laid out a methodological roadmap: a bridge designed to connect the abstract potential of a new theory to the reality of empirical validation and, ultimately, clinical application. The trial-and-error nature of psychiatric care is not an intractable problem but a scientific challenge that demands new tools and new ways of thinking. The shift from static prediction to dynamic prognosis is a necessary evolution for the field, and the framework proposed here represents a step toward that transformation by defining how data, computation, and human factors can converge into a coherent system of prognostic reasoning.
By adopting a comprehensive, rigorous, and ethically aware methodology, one that integrates sophisticated data architectures, theoretically grounded models, and a robust translational pipeline, the research community can begin to build and validate the tools needed to realise the vision of a truly personalised and prognostic psychiatry. The impact of this framework should ultimately be measured not by technical novelty alone, but by its capacity to inform real-world clinical decision-making, enhance patient engagement, and improve treatment outcomes. It is not an endpoint but a starting point for the empirical work ahead: a call to action to build the data infrastructure and computational systems that will allow the field to ensure that every patient has the best chance of receiving the right treatment, at the right time, on the right trajectory toward recovery.

Author Contributions

Conceptualization, H.N.-W.; methodology, H.N.-W., L.D., I.S.V. and S.L.; validation, H.N.-W., L.D. and I.S.V.; formal analysis, H.N.-W., L.D., I.S.V. and S.L.; investigation, H.N.-W., L.D. and I.S.V.; resources, H.N.-W., L.D. and I.S.V.; writing—original draft preparation, H.N.-W., L.D., I.S.V. and S.L.; writing—review and editing, H.N.-W., L.D., I.S.V. and S.L.; supervision, L.D. and I.S.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors acknowledge the partial use of generative AI and/or LLM tools (Paperpal (https://paperpal.com/, accessed on 6 January 2026) by Editage) for copy-editing to improve language and readability in accordance with emerging best practices in academic publishing. The authors are fully responsible for the content, accuracy, and integrity of this study.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI	Artificial Intelligence
BPTT	Backpropagation Through Time
CDS	Clinical Decision Support
DTW	Dynamic Time Warping
EHRs	Electronic Health Records
FDA	Food and Drug Administration (US)
fMRI	Functional Magnetic Resonance Imaging
GDPR	General Data Protection Regulation (EU)
GPS	Global Positioning System
HAM-D	Hamilton Depression Rating Scale
HCD	Human-Centred Design
HCI	Human-Computer Interaction
HIPAA	Health Insurance Portability and Accountability Act (US)
HPA	Hypothalamic-Pituitary-Adrenal (axis)
ICD-10	International Classification of Diseases, 10th Revision
LSTM	Long Short-Term Memory
MAE	Mean Absolute Error
MDD	Major Depressive Disorder
ORCID	Open Researcher and Contributor ID
PHQ-9	Patient Health Questionnaire-9
PROMs	Patient-Reported Outcome Measures
PRT	Predicted Recovery Trajectory
PSV	Patient State Vector
RDoC	Research Domain Criteria
RMSE	Root Mean Squared Error
RNN	Recurrent Neural Network
SHAP	Shapley Additive Explanations
SMS	Short Message Service
SSM	State-Space Model
SUDs	Substance Use Disorders
TIF	Therapeutic Impulse Function
WHO	World Health Organization
XAI	Explainable Artificial Intelligence

References

  1. Abrahams, A.B.; Beckenstrom, A.; Browning, M.; Dias, R.; Goodwin, G.M.; Gorwood, P.; Kingslake, J.; Morriss, R.; Reif, A.; Ruhe, H.G.; et al. Exploring the incidence of inadequate response to antidepressants in the primary care of depression. Eur. Neuropsychopharmacol. 2024, 83, 61–70. [Google Scholar] [CrossRef]
  2. Penn, E.M.; Tracy, D.K. The drugs don’t work? Antidepressants and the current and future pharmacological management of depression. Ther. Adv. Psychopharmacol. 2012, 2, 179–188. [Google Scholar] [CrossRef]
  3. Alharbi, A. Treatment-resistant depression: Therapeutic trends, challenges, and future directions. Patient Prefer. Adherence 2012, 6, 369–388. [Google Scholar] [CrossRef]
  4. Zelek-Molik, A.; Litwa, E. Trends in research on novel antidepressant treatments. Front. Pharmacol. 2025, 16, 1544795. [Google Scholar] [CrossRef]
  5. Voineskos, D.; Daskalakis, Z.J.; Blumberger, D.M. Management of treatment-resistant depression: Challenges and strategies. Neuropsychiatr. Dis. Treat. 2020, 16, 221–234. [Google Scholar] [CrossRef] [PubMed]
  6. Rush, A.J.; Trivedi, M.H.; Wisniewski, S.R.; Nierenberg, A.A.; Stewart, J.W.; Warden, D.; Niederehe, G.; Thase, M.E.; Lavori, P.W.; Lebowitz, B.D.; et al. Acute and longer-term outcomes in depressed outpatients requiring one or several treatment steps: A STAR*D report. Am. J. Psychiatry 2006, 163, 1905–1917. [Google Scholar] [CrossRef]
  7. Crown, W.H.; Finkelstein, S.; Berndt, E.R.; Ling, D.; Poret, A.W.; Rush, A.J.; Russell, J.M. The impact of treatment-resistant depression on health care utilization and costs. J. Clin. Psychiatry 2002, 63, 963–971. [Google Scholar] [CrossRef] [PubMed]
  8. Lépine, J.P.; Briley, M. The increasing burden of depression. Neuropsychiatr. Dis. Treat. 2011, 7, 3. [Google Scholar] [CrossRef]
  9. Ozomaro, U.; Wahlestedt, C.; Nemeroff, C.B. Personalized medicine in psychiatry: Problems and promises. BMC Med. 2013, 11, 132. [Google Scholar] [CrossRef] [PubMed]
  10. Baminiwatta, A. Global trends of machine learning applications in psychiatric research over 30 years: A bibliometric analysis. Asian J. Psychiatry 2022, 69, 102986. [Google Scholar] [CrossRef]
  11. Iyortsuun, N.K.; Kim, S.; Jhon, M.; Yang, H.; Pant, S. A review of machine learning and deep learning approaches on mental health diagnosis. Healthcare 2023, 11, 285. [Google Scholar] [CrossRef]
  12. Sun, J.; Lu, T.; Shao, X.; Han, Y.; Xia, Y.; Zheng, Y.; Wang, Y.; Li, X.; Ravindran, A.; Fan, L.; et al. Practical AI application in psychiatry: Historical review and future directions. Mol. Psychiatry 2025, 30, 4399–4408. [Google Scholar] [CrossRef]
  13. Shatte, A.; Hutchinson, D.; Teague, S. Machine learning in mental health: A scoping review of methods and applications. Psychol. Med. 2019, 49, 1426–1448. [Google Scholar] [CrossRef]
  14. Karvelis, P.; Charlton, C.E.; Allohverdi, S.G.; Bedford, P.; Hauke, D.J.; Diaconescu, A.O. Computational approaches to treatment response prediction in major depression using brain activity and behavioral data: A systematic review. Netw. Neurosci. 2022, 6, 1066–1103. [Google Scholar] [CrossRef] [PubMed]
  15. Coley, R.Y.; Boggs, J.M.; Beck, A.; Simon, G.E. Predicting outcomes of psychotherapy for depression with electronic health record data. J. Affect. Disord. Rep. 2021, 6, 100198. [Google Scholar] [CrossRef] [PubMed]
  16. Simon, G.E.; Cruz, M.; Boggs, J.M.; Beck, A.; Shortreed, S.M.; Coley, R.Y. Predicting outcomes of antidepressant treatment in community practice settings. Psychiatr. Serv. 2024, 75, 419–426. [Google Scholar] [CrossRef]
  17. Elsaesser, M.; Feige, B.; Kriston, L.; Schumacher, L.; Peifer, J.; Hautzinger, M.; Härter, M.; Schramm, E. Longitudinal clusters of long-term trajectories in patients with early-onset chronic depression: 2 years of naturalistic follow-up after extensive psychological treatment. Psychother. Psychosom. 2023, 93, 65–74. [Google Scholar] [CrossRef]
  18. Choi, E.; Bahadori, M.T.; Schuetz, A.; Stewart, W.F.; Sun, J. Doctor AI: Predicting Clinical Events via Recurrent Neural Networks. arXiv 2016, arXiv:1511.05942. [Google Scholar] [CrossRef]
  19. Frässle, S.; Marquand, A.F.; Schmaal, L.; Dinga, R.; Veltman, D.J.; Wee, N.J.v.D.; Tol, M.v.; Schöbi, D.; Penninx, B.W.; Stephan, K.E. Predicting individual clinical trajectories of depression with generative embedding. NeuroImage Clin. 2020, 26, 102213. [Google Scholar] [CrossRef]
  20. Lai, W.; Liao, Y.; Zhang, H.; Zhao, H.; Li, Y.; Chen, R.; Shi, G.; Liu, Y.; Hao, J.; Li, Z.; et al. The trajectory of depressive symptoms and the association with quality of life and suicidal ideation in patients with major depressive disorder. BMC Psychiatry 2025, 25, 310. [Google Scholar] [CrossRef]
  21. Schmaal, L.; Marquand, A.F.; Rhebergen, D.; Tol, M.v.; Ruhé, H.G.; Wee, N.J.v.D.; Veltman, D.J.; Penninx, B.W. Predicting the naturalistic course of major depressive disorder using clinical and multimodal neuroimaging information: A multivariate pattern recognition study. Biol. Psychiatry 2015, 78, 278–286. [Google Scholar] [CrossRef]
  22. Stephan, K.E.; Bach, D.R.; Fletcher, P.C.; Flint, J.; Frank, M.J.; Friston, K.J.; Heinz, A.; Huys, Q.J.M.; Owen, M.J.; Binder, E.B.; et al. Charting the landscape of computational psychiatry. Lancet Psychiatry 2017, 4, 324–334. [Google Scholar] [CrossRef]
  23. Ngabo-Woods, H.; Dunai, L.; Verdú, I.S. A Prognostic Theory of Treatment Response for Major Depressive Disorder: A Dynamic Systems Framework for Forecasting Clinical Trajectories. Appl. Sci. 2025, 15, 12524. [Google Scholar] [CrossRef]
  24. Farrar, C.R.; Lieven, N.A.J. Damage prognosis: The future of structural health monitoring. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2007, 365, 623–632. [Google Scholar] [CrossRef]
  25. Hayes, A.M.; Andrews, L.A. A complex systems approach to the study of change in psychotherapy. BMC Med. 2020, 18, 197. [Google Scholar] [CrossRef] [PubMed]
  26. Durstewitz, D.; Huys, Q.J.M.; Koppe, G. Psychiatric illnesses as disorders of network dynamics. Biol. Psychiatry Cogn. Neurosci. Neuroimaging 2021, 6, 865–876. [Google Scholar] [CrossRef] [PubMed]
  27. Sajjadian, M.; Lam, R.W.; Milev, R.; Rotzinger, S.; Frey, B.N.; Soares, C.N.; Parikh, S.V.; Foster, J.A.; Turecki, G.; Müller, D.J.; et al. Machine learning in the prediction of depression treatment outcomes: A systematic review and meta-analysis. Psychol. Med. 2021, 51, 2742–2751. [Google Scholar] [CrossRef]
  28. Curtiss, J.; DiPietro, C.P. Machine learning in the prediction of treatment response for emotional disorders: A systematic review and meta-analysis. Clin. Psychol. Rev. 2025, 120, 102593. [Google Scholar] [CrossRef]
  29. Ntam, V.A.; Huebner, T.; Steffens, M.; Scholl, C. Machine learning approaches in the therapeutic outcome prediction in major depressive disorder: A systematic review. Front. Psychiatry 2025, 16, 1588963. [Google Scholar] [CrossRef]
  30. Hiemke, C.; Baumann, P.; Bergemann, N.; Conca, A.; Dietmaier, O.; Egberts, K.; Fric, M.; Gerlach, M.; Greiner, C.; Gründer, G.; et al. AGNP consensus guidelines for therapeutic drug monitoring in psychiatry: Update 2011. Pharmacopsychiatry 2011, 44, 195–235. [Google Scholar] [CrossRef]
  31. Kang, S.; Cho, S. Neuroimaging biomarkers for predicting treatment response and recurrence of major depressive disorder. Int. J. Mol. Sci. 2020, 21, 2148. [Google Scholar] [CrossRef]
  32. Köhler-Forsberg, K.; Jorgensen, A.; Dam, V.H.; Stenbæk, D.S.; Fisher, P.M.; Ip, C.; Ganz, M.; Poulsen, H.E.; Giraldi, A.; Ozenne, B.; et al. Predicting treatment outcome in major depressive disorder using serotonin 4 receptor pet brain imaging, functional mri, cognitive-, eeg-based, and peripheral biomarkers: A neuropharm open label clinical trial protocol. Front. Psychiatry 2020, 11, 641. [Google Scholar] [CrossRef]
  33. Fonseka, T.M.; MacQueen, G.; Kennedy, S.H. Neuroimaging biomarkers as predictors of treatment outcome in major depressive disorder. J. Affect. Disord. 2018, 233, 21–35. [Google Scholar] [CrossRef]
  34. Li, Z.; McIntyre, R.S.; Husain, S.F.; Ho, R.; Tran, B.X.; Nguyen, H.T.; Soo, S.; Ho, C.S.H.; Chen, N. Identifying neuroimaging biomarkers of major depressive disorder from cortical hemodynamic responses using machine learning approaches. eBioMedicine 2022, 79, 104027. [Google Scholar] [CrossRef]
  35. Li, X.; Pei, C.; Wang, X.; Wang, H.; Tian, S.; Yao, Z.; Lü, Q. Predicting neuroimaging biomarkers for antidepressant selection in early treatment of depression. J. Magn. Reson. Imaging 2021, 54, 551–559. [Google Scholar] [CrossRef]
  36. Cai, H.; Song, H.; Yang, Y.; Xiao, Z.; Zhang, X.; Jiang, F.; Liu, H.; Tang, Y. Big-five personality traits and depression: Chain mediation of self-efficacy and walking. Front. Psychiatry 2024, 15, 1460888. [Google Scholar] [CrossRef]
  37. Watson, M.; Protzner, A.B.; McGirr, A. Five-factor personality and antidepressant response to intermittent theta burst stimulation for major depressive disorder. Transcranial Magn. Stimul. 2025, 5, 100196. [Google Scholar] [CrossRef]
  38. Chen, J.; Huang, H. The influence of big five personality traits on depression and suicidal behavior. In The Association Between Depression and Suicidal Behavior; IntechOpen: London, UK, 2024. [Google Scholar] [CrossRef]
  39. Choi, E.; Bahadori, M.T.; Kulas, J.A.; Schuetz, A.; Stewart, W.F.; Sun, J. RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Curran Associates Inc.: Red Hook, NY, USA, 2016. NIPS’16. pp. 3512–3520. [Google Scholar]
  40. Raballo, A. Digital phenotyping: An overarching framework to capture our extended mental states. Lancet Psychiatry 2018, 5, 194–195. [Google Scholar] [CrossRef] [PubMed]
  41. Torous, J.; Onnela, J.; Keshavan, M.S. New dimensions and new tools to realize the potential of rdoc: Digital phenotyping via smartphones and connected devices. Transl. Psychiatry 2017, 7, e1053. [Google Scholar] [CrossRef] [PubMed]
  42. Baytas, I.M.; Xiao, C.; Zhang, X.; Wang, F.; Jain, A.K.; Zhou, J. Patient Subtyping via Time-Aware LSTM Networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; KDD ’17. pp. 65–74. [Google Scholar] [CrossRef]
  43. Guo, Y.; Wen, T.; Yue, S.; Zhao, X.; Huang, K. The Influence of Health Information Attention and App Usage Frequency of Older Adults on Persuasive Strategies in mHealth Education Apps. Digit. Health 2023, 9, 20552076231167003. [Google Scholar] [CrossRef] [PubMed]
  44. Zhang, Z.; Wang, Y.; Tan, S.; Xia, B.; Luo, Y. Enhancing transformer-based models for long sequence time series forecasting via structured matrix. Neurocomputing 2025, 625, 129429. [Google Scholar] [CrossRef]
  45. Liu, L.J.; Ortiz-Soriano, V.; Neyra, J.A.; Chen, J. Kit-lstm: Knowledge-guided time-aware lstm for continuous clinical risk prediction. In Proceedings of the 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Las Vegas, NV, USA, 6–8 December 2022; pp. 1086–1091. [Google Scholar] [CrossRef]
  46. Miotto, R.; Li, L.; Kidd, B.A.; Dudley, J.T. Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records. Sci. Rep. 2016, 6, 26094. [Google Scholar] [CrossRef]
  47. Lipton, Z.C.; Kale, D.C.; Elkan, C.; Wetzel, R. Learning to Diagnose with LSTM Recurrent Neural Networks. arXiv 2015, arXiv:1511.03677. [Google Scholar]
  48. Che, Z.; Purushotham, S.; Cho, K.; Sontag, D.; Liu, Y. Recurrent Neural Networks for Multivariate Time Series with Missing Values. Sci. Rep. 2018, 8, 6085. [Google Scholar] [CrossRef]
  49. Song, Z.; Lu, Q.; Zhu, H.; Buckeridge, D.; Li, Y. Bidirectional Generative Pre-training for Improving Healthcare Time-series Representation Learning. arXiv 2024, arXiv:2402.09558. [Google Scholar]
  50. Liu, R.; Hou, X.; Liu, S.; Zhou, Y.; Zhou, J.; Qiao, K.; Qi, H.; Li, R.; Yang, Z.; Zhang, L.; et al. Predicting antidepressant response via local-global graph neural network and neuroimaging biomarkers. npj Digit. Med. 2025, 8, 515. [Google Scholar] [CrossRef]
  51. Ntekouli, M.; Spanakis, G.; Waldorp, L.; Roefs, A. Exploiting Individual Graph Structures to Enhance Ecological Momentary Assessment (EMA) Forecasting. arXiv 2024, arXiv:2403.19442. [Google Scholar] [CrossRef]
  52. Lin, E.; Chen, C.H.; Chen, H.H. Computational approaches to treatment response prediction in major depressive disorder: A systematic review. Netw. Neurosci. 2021, 6, 1066–1090. [Google Scholar] [CrossRef]
  53. An, P.H. Exploring the Digital Healthcare Product’s Logistics and Mental Healthcare in the Metaverse: Role of Technology Anxiety and Metaverse Bandwidth Fluctuations. iRASD J. Manag. 2023, 5, 223–241. [Google Scholar] [CrossRef]
  54. Wen, Q.; Zhou, T.; Zhang, C.; Chen, W.; Ma, Z.; Yan, J.; Sun, L. Transformers in time series: A survey. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, Macao, China, 19–25 August 2023; IJCAI: Bremen, Germany, 2023; pp. 6778–6786. [Google Scholar] [CrossRef]
  55. Jiang, Y.; Ning, K.; Pan, Z.; Shen, X.; Ni, J.; Yu, W.; Schneider, A.; Chen, H.; Nevmyvaka, Y.; Song, D. Multi-modal Time Series Analysis: A Tutorial and Survey. arXiv 2025, arXiv:2503.13709. [Google Scholar] [CrossRef]
  56. Chang, C.; Hwang, J.; Shi, Y.; Wang, H.; Peng, W.C.; Chen, T.F.; Wang, W. Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series. arXiv 2025, arXiv:2506.10412. [Google Scholar] [CrossRef]
  57. Joyce, D.W.; Kormilitzin, A.; Smith, K.; Cipriani, A. Explainable artificial intelligence for mental health through transparency and interpretability for understandability. npj Digit. Med. 2023, 6, 6. [Google Scholar] [CrossRef]
  58. Probierz, B.; Straś, A.; Rodek, P.; Kozak, J. Explainable ai in psychiatry. In Explainable Artificial Intelligence for Sustainable Development; Routledge: London, UK, 2025; pp. 245–262. [Google Scholar] [CrossRef]
  59. Alowais, S.A.; Alghamdi, S.S.; Alsuhebany, N.; Alqahtani, T.; Alshaya, A.I.; Almoaiqel, M. Revolutionizing healthcare: The role of artificial intelligence in clinical practice. BMC Med. Educ. 2023, 23, 689. [Google Scholar] [CrossRef]
  60. Tilala, M.H.; Chenchala, P.K.; Choppadandi, A.; Kaur, J.; Naguri, S.; Saoji, R.; Devaguptapu, B. Ethical considerations in the use of artificial intelligence and machine learning in health care: A comprehensive review. Cureus 2024, 16, e62443. [Google Scholar] [CrossRef]
  61. Bauer, M.; Glenn, T.; Monteith, S.; Bauer, R.; Whybrow, P.C.; Geddes, J. Ethical perspectives on recommending digital technology for patients with mental illness. Int. J. Bipolar Disord. 2017, 5, 6. [Google Scholar] [CrossRef] [PubMed]
  62. Ratti, E.; Morrison, M.; Jakab, I. Ethical and social considerations of applying artificial intelligence in healthcare—A two-pronged scoping review. BMC Med. Ethics 2025, 26, 68. [Google Scholar] [CrossRef]
  63. Rieke, N.; Hancox, J.; Li, W.; Milletarì, F.; Roth, H.R.; Albarqouni, S.; Bakas, S.; Galtier, M.; Landman, B.A.; Maier-Hein, K.; et al. The future of digital health with federated learning. npj Digit. Med. 2020, 3, 119. [Google Scholar] [CrossRef] [PubMed]
  64. Oladimeji, O.; Oladimeji, T.; Abiodun, O. Artificial intelligence in mental health: A review of current trends and future directions. J. Ment. Health Clin. Psychol. 2023, 7, 65–72. [Google Scholar] [CrossRef]
  65. Avula, V.C.R.; Amalakanti, S. Artificial intelligence in psychiatry, present trends, and challenges: An updated review. Arch. Ment. Health 2023, 25, 85–90. [Google Scholar] [CrossRef]
  66. Alhuwaydi, A.M. Exploring the role of artificial intelligence in mental healthcare: Current trends and future directions—A narrative review for a comprehensive insight. Risk Manag. Healthc. Policy 2024, 17, 1339–1348. [Google Scholar] [CrossRef]
  67. Hoose, S.; Králiková, K. Artificial intelligence in mental health care: Management implications, ethical challenges, and policy considerations. Adm. Sci. 2024, 14, 227. [Google Scholar] [CrossRef]
  68. Cruz-Gonzalez, P.; He, A.; Lam, E.K.M.; Ng, I.A.T.; Li, M.; Hou, R.; Chan, J.N.; Sahni, Y.; Viñas-Guasch, N.; Miller, T.; et al. Artificial intelligence in mental health care: A systematic review of diagnosis, monitoring, and intervention applications. Psychol. Med. 2025, 55, e18. [Google Scholar] [CrossRef] [PubMed]
  69. Pfisterer, F. Algorithmic fairness. In Applied Machine Learning Using Mlr3 in R; Chapman and Hall/CRC: Boca Raton, FL, USA, 2023; pp. 316–324. [Google Scholar] [CrossRef]
  70. Summerton, N.; Cansdale, M. Artificial intelligence and diagnosis in general practice. Br. J. Gen. Pract. 2019, 69, 324–325. [Google Scholar] [CrossRef] [PubMed]
  71. Jones, C.; Thornton, J.; Wyatt, J.C. Artificial intelligence and clinical decision support: Clinicians’ perspectives on trust, trustworthiness, and liability. Med. Law Rev. 2023, 31, 501–520. [Google Scholar] [CrossRef]
  72. Wang, X.; Zhang, Y.; Zhu, R. A brief review on algorithmic fairness. Manag. Syst. Eng. 2022, 1, 7. [Google Scholar] [CrossRef]
  73. Morley, J.; Machado, C.C.V.; Burr, C.; Cowls, J.; Joshi, I.; Taddeo, M.; Floridi, L. The ethics of ai in health care: A mapping review. Soc. Sci. Med. 2020, 260, 113172. [Google Scholar] [CrossRef]
  74. Rajkomar, A.; Hardt, M.; Howell, M.; Corrado, G.S.; Chin, M.H. Ensuring fairness in machine learning to advance health equity. Ann. Intern. Med. 2018, 169, 866–872. [Google Scholar] [CrossRef]
  75. Reddy, S. Generative ai in healthcare: An implementation science informed translational path on application, integration and governance. Implement. Sci. 2024, 19, 27. [Google Scholar] [CrossRef]
  76. Mucci, F.; Marazziti, D. Artificial Intelligence in Neuropsychiatry: A Potential Beacon in an Ocean of Uncertainty? Clin. Neuropsychiatry 2023, 20, 467–471. [Google Scholar]
  77. Ray, A.; Bhardwaj, A.; Malik, Y.K.; Singh, S.; Gupta, R. Artificial intelligence and psychiatry: An overview. Asian J. Psychiatry 2022, 70, 103021. [Google Scholar] [CrossRef]
  78. World Health Organization. WHO Calls for Safe and Ethical AI for Health. 2023. Available online: https://www.who.int/news/item/16-05-2023-who-calls-for-safe-and-ethical-ai-for-health (accessed on 28 June 2025).
  79. Mishra, R.; Satpathy, R.; Pati, B. Human computer interaction applications in healthcare: An integrative review. EAI Endorsed Trans. Pervasive Health Technol. 2023, 9, 1–10. [Google Scholar] [CrossRef]
  80. Zhao, X.; Zhang, S.; Nan, D.; Han, J.; Kim, J.H. Human–computer interaction in healthcare: A bibliometric analysis with citespace. Healthcare 2024, 12, 2467. [Google Scholar] [CrossRef] [PubMed]
  81. Caetano, R.; Oliveira, J.M.; Ramos, P. Transformer-based models for probabilistic time series forecasting with explanatory variables. Mathematics 2025, 13, 814. [Google Scholar] [CrossRef]
Figure 1. Overview of the Prognostic Framework.
Table 1. The Patient State Vector (PSV) Data Dictionary.
Raw data sources:
Domain | Sub-Domain | Data Source | Raw Metric
Clinical | Symptom Severity | PHQ-9 Questionnaire | Sum of item scores
Clinical | Comorbidity | Electronic Health Record | ICD-10 Codes
Biological | Pharmacogenetics | Genetic Assay | CYP2D6 genotype
Biological | Neuroendocrinology | Saliva/Blood Sample | Cortisol concentration
Digital Phenotype | Mobility | Smartphone GPS | Latitude/Longitude
Digital Phenotype | Social Activity | Smartphone Call Log | Call metadata
Digital Phenotype | Social Activity | Smartphone SMS Log | SMS metadata
Digital Phenotype | Circadian Rhythms | Smartphone Screen | Screen on/off timestamps
Digital Phenotype | Circadian Rhythms | Wearable Accelerometer | 3-axis acceleration
Digital Phenotype | Physical Activity | Wearable Accelerometer | Step detection
Engineered features:
Domain | Engineered Feature | Sampling Freq. | Theoretical Link to MDD
Clinical | Weekly PHQ-9 Score | Weekly | Core measure of depression severity
Clinical | Binary flags for anxiety, SUDs | Static (Baseline) | Comorbidity impacts prognosis
Biological | Poor/Intermediate/Normal/Ultra-rapid metaboliser status | Static (Baseline) | Influences drug exposure/side effects
Biological | Baseline cortisol level | Static (Baseline) | HPA axis dysregulation
Digital Phenotype | Location Entropy | Daily (from Hz data) | Anhedonia, avolition, behavioural withdrawal
Digital Phenotype | Time Spent at Home | Daily (from Hz data) | Social withdrawal
Digital Phenotype | Number of unique contacts | Daily | Social network size/engagement
Digital Phenotype | Ratio of incoming/outgoing texts | Daily | Social reciprocity
Digital Phenotype | Inferred sleep duration | Daily | Sleep disturbance, insomnia/hypersomnia
Digital Phenotype | Circadian Regularity Index | Daily (from Hz data) | Disruption of daily routines
Digital Phenotype | Daily Step Count | Daily | Psychomotor retardation/agitation
Table 2. Model Evaluation Benchmarks.
Model | Input Data | Evaluation Metrics
Multiple Linear Regression (Baseline) | Baseline PSV only (static features) | Endpoint MAE, AUC at 12 weeks
ARIMA | Univariate time-series of clinical scores only | Trajectory RMSE, Endpoint MAE
Standard LSTM | Full time-series of PSV + TIF (no time-awareness) | Trajectory RMSE, Endpoint MAE, DTW
Time-Aware LSTM (Proposed) | Full time-series of PSV + TIF (with time-awareness) | Trajectory RMSE, Endpoint MAE, DTW
Table 3. Implementation Roadmap: Phases, Risks, and Mitigations.
Implementation Phase | Key Deliverable | Primary Risk | Methodological Mitigation
Phase 1: Data Architecture | Constructed PSV & TIF vectors | Sparsity & Noise: Sensors fail; patients skip surveys | Impute missingness via time-decay gates; use “masking” layers in LSTM training
Phase 2: Model Training | Trained Time-Aware LSTM | Overfitting: Model memorises specific patient histories | Nested Cross-Validation (patient-level split); Regularisation (Dropout)
Phase 3: Explainability (XAI) | SHAP plots & Attention Maps | Misinterpretation: Clinicians over-rely on false signals | Counterfactual testing (“What if?”); Uncertainty quantification (Confidence Intervals)
Phase 4: Clinical Integration | Decision Support Interface | Alert Fatigue: Too many false alarms | Set high specificity thresholds; co-design interface with clinicians (HCD)
Phase 5: Governance | Bias Audit Report | Algorithmic Bias: Model fails for minority groups | Mandatory performance disaggregation by demographic group; Federated Learning for privacy