Deep Learning Approaches to Natural Language Processing for Digital Twins of Patients in Psychiatry and Neurological Rehabilitation
Abstract
1. Introduction
- Data ingestion: The framework starts by securely ingesting multimodal clinical data, including unstructured text (e.g., therapist notes, progress reports) and structured data (e.g., diagnostic codes, medication history, neuropsychological scores), typically uploaded to a cloud environment;
- Preprocessing and anonymization: The raw data are preprocessed to clean and normalize text, tokenize language, and anonymize sensitive patient identifiers to ensure compliance with privacy regulations (e.g., HIPAA, GDPR);
- Model training and tuning: Pre-trained transformer-based NLP models (e.g., BERT, GPT) are fine-tuned on a domain-specific dataset to learn semantic and clinical representations relevant to psychiatry and neurological rehabilitation;
- Digital twin generation: For each patient, a digital twin is created by integrating current clinical data with historical patterns, using a hybrid model that combines deep NLP embeddings with temporal models (e.g., LSTM or Transformer encoders);
- API layer exposure: A service interface (REST API) provides endpoints that allow the client to request predictions, simulate treatment outcomes, or retrieve current digital twin profiles for specific patients;
- User interaction via user interface: Physicians/researchers access the system via a web-based UI, where they can enter new data, review NLP-generated insights, and visualize predicted patient trajectories or intervention outcomes;
- Continuous learning and updating: As new data are introduced (e.g., after each treatment session or assessment), the system updates the digital twin in real time and periodically retrains or fine-tunes the models to improve accuracy and personalization.
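As an illustration of the preprocessing and anonymization step listed above, the following minimal sketch masks a few obvious identifier patterns with regular expressions and normalizes whitespace. The patterns, the placeholder tags, and the function name `anonymize_note` are assumptions made for illustration and are not taken from the framework described here. Rule-based masking of this kind would not, on its own, satisfy HIPAA or GDPR de-identification requirements.

```python
import re

# Hypothetical patterns for illustration only. Names, addresses, and other
# free-text identifiers are NOT handled here and would need a validated tool.
PATTERNS = [
    (re.compile(r"\bMRN[: ]*\d+\b", re.IGNORECASE), "[MRN]"),    # record numbers
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), "[DATE]"),            # ISO dates
    (re.compile(r"\b\d{3}[ -]\d{3}[ -]\d{4}\b"), "[PHONE]"),     # 3-3-4 phone numbers
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),    # e-mail addresses
]


def anonymize_note(text: str) -> str:
    """Replace matched identifier patterns with tags, then collapse whitespace."""
    for pattern, tag in PATTERNS:
        text = pattern.sub(tag, text)
    return " ".join(text.split())


if __name__ == "__main__":
    note = "Seen on 2024-03-15.  MRN: 123456. Contact 555-123-4567 or j.doe@example.org."
    print(anonymize_note(note))
```

Keeping the patterns in a single ordered list makes the masking rules easy to audit and version alongside the rest of the preprocessing code.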
1.1. Scientific Problem
1.2. Observed Gaps and Challenges
2. Methods Used
2.1. Concept of the Study
2.2. The Dataset and Its Preprocessing
- Missing values in structured data: The framework used a hybrid imputation strategy depending on the variable type:
  - Numeric clinical metrics (e.g., test scores, biometrics) were imputed using mean or median imputation, where appropriate, or using more advanced methods such as k-nearest neighbor imputation to preserve patient-level patterns;
  - Categorical variables (e.g., diagnosis codes, medication types) were imputed with the mode or assigned an explicit “Missing” category to preserve model interpretability;
- Missing or incomplete text notes: In cases where clinician notes or treatment reports were incomplete or missing at specific time points:
  - The framework implemented temporal smoothing by propagating earlier notes forward with timestamps while marking them as imputed to distinguish them from the original content;
  - To prevent the model from overfitting to duplicated text, these placeholders were reduced during training;
- Outlier detection and noise reduction: Structured outliers were identified using z-score thresholds or interquartile range (IQR) methods and were corrected (if clearly erroneous) or removed. For text data, irrelevant or inconsistent sentences (e.g., boilerplate headings, unrelated templates) were filtered out using regular expressions and a simple rule-based natural language cleaner;
- Data quality audits and logging: A validation protocol flagged inconsistencies, such as mismatched timestamps or conflicting diagnoses, and these were then manually checked or corrected using logical rules when patterns were clear;
- Model robustness to missing data: Models were trained using masked input representations to simulate real-world missingness, ensuring that the system remained robust when handling incomplete patient profiles during inference;
- Documentation and transparency: All preprocessing decisions were logged, version-controlled, and documented, providing transparency and repeatability in both model development and clinical interpretation.
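Several of the preprocessing steps listed above can be sketched with the Python standard library alone. The example below shows median imputation for numeric values, an explicit “Missing” category for categorical values, forward propagation of text notes with an imputation flag, and IQR-based outlier detection. All function names, the data layout, and the multiplier `k=1.5` are assumptions chosen for illustration; the original framework's implementation is not specified in this section, and k-nearest neighbor imputation and masked-input training are not covered.

```python
from statistics import median, quantiles

MISSING = "Missing"  # explicit category for absent categorical values


def impute_numeric(values):
    """Median imputation: replace None with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def impute_categorical(values):
    """Replace None with an explicit 'Missing' category."""
    return [MISSING if v is None else v for v in values]


def carry_forward_notes(notes):
    """Propagate the most recent note forward and flag propagated entries.

    `notes` is a list of (timestamp, text_or_None) pairs sorted by time.
    """
    result, last_text = [], None
    for timestamp, text in notes:
        if text:
            last_text = text
            result.append({"timestamp": timestamp, "text": text, "imputed": False})
        else:
            # Nothing to propagate before the first real note; text stays None.
            result.append({"timestamp": timestamp, "text": last_text,
                           "imputed": last_text is not None})
    return result


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    print(impute_numeric([24, None, 30, 27]))
    print(impute_categorical(["F32.1", None]))
    print(carry_forward_notes([("t1", "Stable mood."), ("t2", None)]))
    print(iqr_outliers([10, 11, 12, 12, 13, 11, 12, 95]))
```

The `imputed` flag on propagated notes is what would allow a training loop to reduce or skip such placeholders, as the list above describes.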
2.3. Computational Tools and Statistical Methods
3. Results of Studies
4. Discussion
4.1. Scientific Significance of the Study
4.2. Clinical and Economic Significance of the Study
4.3. Social and Ethical Significance of the Study
4.4. Limitations
4.5. Directions for Further Research
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
AI | Artificial Intelligence |
BERT | Bidirectional Encoder Representations from Transformers |
DL | Deep Learning |
GDPR | General Data Protection Regulation |
GPT | Generative Pre-trained Transformer |
HIPAA | Health Insurance Portability and Accountability Act |
NLP | Natural Language Processing |
XAI | eXplainable Artificial Intelligence |
Area | Limitation | Description |
---|---|---|
Dataset quality limitations | Data scarcity and fragmentation | Psychiatric and neurological data are often fragmented across systems (EHRs, therapy notes, imaging centers). Unstructured text formats dominate (e.g., physician notes, patient diaries), making standardization and preprocessing for DL models difficult. |
Dataset quality limitations | Lack of annotated clinical NLP corpora | Annotated datasets in the psychiatry and neurological rehabilitation domains are rare due to privacy constraints and annotation complexity. Domain-specific terminology and specific expressions used by physicians require expert annotators, which increases costs and time. |
Dataset quality limitations | Noise and ambiguity in language | Clinical notes often contain abbreviations, colloquialisms, incomplete sentences, or idiosyncratic language that is not suitable for general NLP models. Psychiatric data include subjective descriptions, making the quality and consistency of labels poor. |
Dataset quality limitations | Demographic and linguistic bias | Existing datasets may underrepresent certain age groups, socioeconomic backgrounds, or non-native speakers, leading to biased models that do not perform well across populations. |
Challenges of real-world implementation | Model generalizability and robustness | DL models trained on data from one institution may not generalize to another due to differences in language use, documentation style, and patient demographics. Minor changes in clinical wording or patient populations can lead to significant performance degradation. |
Challenges of real-world implementation | Data privacy and ethical barriers | Stringent regulations (e.g., HIPAA, GDPR) limit access to longitudinal patient records. Real-time implementation raises concerns about consent, re-identification, and data management, especially for sensitive psychiatric information. |
Challenges of real-world implementation | Interpretability and clinician trust | DL models (especially transformers such as BERT- or GPT-based models) are black boxes, making clinicians wary of using their results to make critical decisions. Lack of explainability hinders clinical adoption and regulatory approval. |
Challenges of real-world implementation | Integration into clinical workflows | Embedding NLP tools into hospital systems in real time is technically challenging and may require custom APIs, EHR vendor collaboration, and workflow redesign. Risk of alert fatigue or misalignment with clinical priorities exists if models are not context-aware. |
Limitations of interdisciplinary coordination | Misalignment of goals and metrics | Clinicians prioritize patient outcomes and safety, while data scientists may focus on metrics such as F1 score, leading to misalignment in model evaluation. There is difficulty matching what is clinically meaningful with what can be measured statistically. |
Limitations of interdisciplinary coordination | Communication barriers | Clinical experts may not be trained in AI concepts; data scientists may misunderstand medical nuances. Collaborative design of digital twins often suffers from terminological ambiguity and disciplinary silos. |
Limitations of interdisciplinary coordination | Time and resource constraints | Clinicians often have limited time for iterative feedback on model development or annotation tasks. Interdisciplinary teams require ongoing funding and coordination, which is rare in many research settings. |
Limitations of interdisciplinary coordination | Validation and continuous update | Ensuring the continued clinical validity of digital twins requires long-term collaboration and frequent retraining, which can be difficult to coordinate across disciplines. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Mikołajewska, E.; Masiak, J. Deep Learning Approaches to Natural Language Processing for Digital Twins of Patients in Psychiatry and Neurological Rehabilitation. Electronics 2025, 14, 2024. https://doi.org/10.3390/electronics14102024