Article

A Pilot Study Using Natural Language Processing to Explore Textual Electronic Mental Healthcare Data

1 Research & Innovation Department, Hampshire & Isle of Wight Healthcare NHS Foundation Trust, Southampton SO30 3JB, UK
2 College of Medicine and Health Sciences, Sultan Qaboos University, Muscat 123, Oman
3 Psychology Department, University of Southampton, Southampton SO17 1PS, UK
* Author to whom correspondence should be addressed.
Informatics 2025, 12(1), 28; https://doi.org/10.3390/informatics12010028
Submission received: 17 October 2024 / Revised: 3 March 2025 / Accepted: 11 March 2025 / Published: 13 March 2025

Abstract

Mental illness is the single largest cause of disability in the UK, contributing up to 22.8% of the total disease burden, compared with 15.9% for cancer and 16.2% for cardiovascular disease. The wider economic costs of mental illness in the UK have been estimated at GBP 105.2 billion each year. This burden could be reduced with efficient processes and the use of digital technologies. Electronic health records (EHRs), for instance, offer an extraordinary opportunity for research and for improved and optimized care, which in turn would relieve pressure on the mental health system. Using natural language processing methods to explore unstructured EHR text data from mental health services in the UK National Health Service (NHS) brings both opportunities and technical challenges, along with possible solutions. This descriptive study compared technical methods and approaches to leveraging large-scale text data in the EHRs of mental health service providers in the NHS. We conclude that the method used is suitable for mental health services; however, broader studies including other hospital sites are still needed to validate it.

1. Introduction

Over a billion people suffer from mental health conditions, with an increasing global burden of disease [1]. The National Health Service (NHS) of the UK reported that 2.8 million people in England were in contact with secondary mental healthcare, learning disability, and autism services at some point during the year 2020–2021 [2]. Approximately 97,000 people were hospitalized. Whilst some of the observed increase in access could be attributed to COVID-19 and lockdown procedures, the overall mental health burden in the UK is increasing. With such a surge in demand and lack of incremental increases in resources, healthcare services are struggling to cope. This burden could be reduced with efficient processes and the use of digital technologies.
From triaging to providing treatment and following up on patient care, there is a scope where technological advances can be used to improve the status quo. One possibility is leveraging advanced artificial intelligence (AI) techniques like machine learning and deep learning models on healthcare data.
With electronic health records (EHRs) being ubiquitously used, volumes of patient data are being collected and stored electronically. Information contained within EHR data provides a phenomenal opportunity for research and for improved and optimized care. However, mental health patient care is spread across primary, secondary, and tertiary care providers, each using their own EHR system, as illustrated in Figure 1. A study by Dorning et al. [3] revealed that emergency admissions were four times more likely among people with mental illness than among other people. It is imperative that clinicians obtain complete and timely access to patients’ data, and systems that do not offer data sharing across heterogeneous data storage and management systems are a barrier to this. Further, data captured in healthcare IT systems are not in an easy-to-use structured format. Most of the important patient information is recorded in free text by clinicians, nurses, and caregivers during their assessment of patients. Much meaningful information is held in free-text notes that serve as communication among healthcare professionals in day-to-day clinical practice, but these have little to no standardization of format, content, or quality [4,5]. They may contain abbreviations, typos, and grammatically unstructured sentences, and manually sifting through this unstructured content is time-consuming and impractical on a large scale.
Advanced and growing research in the AI field of natural language processing (NLP) shows its ability to analyze large volumes of unstructured clinical texts. Computational linguistics defines NLP as the use of computers to process natural languages [6]. NLP can also support the “cleaning” requirements prior to analyses conducted in structured or unstructured data. This can be in the form of information extraction (IE) or retrieval (IR), categorization, summarization, translation, etc. Though the early concept of NLP dates back to the 1940s [7], it was primarily based on complex sets of rules and parameters that were time-consuming to apply. In the last two decades, the effective use of NLP has been made possible owing to advances in machine learning-driven algorithms and AI-based methods like deep contextualized word representations [8], transformers [9] and large language models [10].
However, one of the major challenges in using NLP techniques for healthcare is the lack of high-quality training data. AI models are only as good as the data on which they are trained, and a large volume of training data is required to create robust, reproducible, and generalizable models. Non-healthcare applications use crowdsourcing options like Amazon Mechanical Turk [11,12] to annotate training samples. However, that is not an option in healthcare applications, given the sensitivity concerning patients’ clinical data privacy [13]. Thus, the task of annotation relies upon a limited number of domain experts. Table 1 below highlights the advantages and disadvantages/limitations of employing NLP methods to leverage textual information in EHR records. Lately, the availability of anonymized data and the growth in transfer learning approaches in AI models have allowed for continued large-scale research on the use of NLP on clinical data by overcoming some of the limitations. Through transfer learning, AI models trained on one domain can perform well on a different but related domain.

2. Natural Language Processing Pipeline

The implementation of NLP in the analysis of EHR follows a structured pipeline to ensure accuracy, interpretability, and reliability. The key steps in the NLP methodology include the following.
1. Data preprocessing
EHR data often contain raw and unstructured clinical notes, which must be cleaned before analysis. Preprocessing involves:
  • De-identification and anonymization to remove personally identifiable information while preserving clinical relevance.
  • Text normalization, including correction of spelling errors, handling of acronyms, and standardization of abbreviations.
  • Tokenization, where free-text clinical notes are segmented into individual words or phrases to facilitate analysis.
2. Annotation and labeling
To develop supervised learning models, annotated datasets are required, hence:
  • Clinical experts annotate a subset of the dataset to identify medical concepts such as diagnoses, symptoms, medications, and treatment responses.
  • Inter-annotator agreement measures to ensure consistency and reliability in the labeling.
3. Model selection and training
Based on the research objectives, various NLP techniques can be employed, such as:
  • Named-entity recognition (NER) to identify key medical concepts from unstructured text.
  • Sentiment analysis to assess emotional and psychological indicators within the notes.
  • Topic modeling to uncover hidden themes and patterns in mental health records.
  • Text summarization to generate concise representations of extensive patient histories.
These methods are detailed in the following section.
4. Model validation and evaluation
The accuracy and reliability of the NLP models could be assessed using standard evaluation metrics:
  • Precision: The proportion of correctly identified concepts out of all identified instances.
  • Recall: The proportion of correctly identified concepts out of all actual instances in the dataset.
  • F1 score: The harmonic mean of precision and recall, ensuring a balance between both.
These metrics should be selected to address the challenges of clinical NLP, where both false positives and false negatives can have significant implications for patient care.
5. Interpretation and integration into clinical research
The final output of the NLP models should be structured into a standardized format, enabling:
  • Efficient data retrieval for mental health research.
  • Integration with existing structured EHR fields to enhance clinical decision-making.
  • Identification of previously under-documented conditions and symptoms in mental health records.
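As a toy sketch of the preprocessing step, the de-identification, normalization, and tokenization operations described above might look like the following in Python. The regex patterns, abbreviation dictionary, and example note are invented for illustration; real de-identification relies on validated tooling rather than ad hoc regexes.

```python
import re

def deidentify(note: str) -> str:
    """Replace simple identifier patterns with placeholders (illustrative only)."""
    note = re.sub(r"\b\d{10}\b", "[NHS_NUMBER]", note)       # 10-digit NHS numbers
    note = re.sub(r"\b\d{2}/\d{2}/\d{4}\b", "[DATE]", note)  # dd/mm/yyyy dates
    return note

# A tiny, invented abbreviation dictionary for standardization.
ABBREVIATIONS = {"pt": "patient", "hx": "history", "rx": "prescription"}

def normalise(note: str) -> str:
    """Lower-case the text and expand known abbreviations."""
    tokens = re.findall(r"[a-zA-Z]+", note.lower())
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def tokenize(note: str) -> list[str]:
    """Segment normalised free text into word tokens."""
    return normalise(note).split()

note = "Pt seen on 12/03/2021, NHS no 4857773456. Hx of low mood."
print(tokenize(deidentify(note)))
```

In a real pipeline each stage would be considerably richer (e.g., named-entity-based de-identification and clinically curated abbreviation lists), but the order of operations is the same.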

3. Overview of Key Methods and Architectures

We reviewed the key NLP methods and architectures used to explore EHR data, along with a few use cases in the mental health domain over the last few years.
  • Named-Entity Recognition (NER)
NER is a subtask of information extraction (IE) wherein predefined concepts of interest are identified within free text. In this context, words or phrases are categorized into predefined labels. NER models can thus be used to identify medical concepts, such as drug names, diagnosis, reported symptoms, health scores, etc. For example, Kormilitzin et al. [15] used a pretrained NER model to recognize drug-related categories like drug names, route, frequency, dosage, strength, form, and duration and evaluated the model on electronic medical records sourced from the UK Clinical Record Interactive Search (UK-CRIS) platform, which is the largest secondary care mental health database in the UK. Since the model was originally trained on intensive care unit data in the US, direct application on target mental health data resulted in a significant drop in performance. However, it was shown that fine-tuning the model on a small sample from the UK mental health records (CRIS) delivered reasonable performance on the target data [16].
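The idea behind NER can be illustrated in its simplest form with a dictionary (gazetteer) lookup. The trained neural models discussed above are far more capable; the surface forms and labels below are invented for illustration only.

```python
# Minimal gazetteer-based recognizer: each surface form maps to a label.
# (Illustrative only; the terms below are made-up examples, not a clinical lexicon.)
GAZETTEER = {
    "sertraline": "DRUG",
    "mirtazapine": "DRUG",
    "50mg": "STRENGTH",
    "once daily": "FREQUENCY",
    "low mood": "SYMPTOM",
}

def extract_entities(text: str) -> list[tuple[str, str]]:
    """Return (span, label) pairs for gazetteer terms found in the text."""
    found = []
    lowered = text.lower()
    for term, label in GAZETTEER.items():
        if term in lowered:
            found.append((term, label))
    return found

note = "Commenced sertraline 50mg once daily for low mood."
print(extract_entities(note))
```

A statistical NER model generalizes beyond such a fixed list, which is why fine-tuned neural models outperform lookups on misspellings and unseen terms.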
  • Sentiment Analysis
Sentiment analysis in NLP refers to categorizing the emotion of a sentence as positive, negative, or neutral. This technique extracts the subjective part of a text (i.e., sentiment) and determines the emotion associated with it. While sentiment analysis is widely applied in areas like online marketing, there is growing research on using the tool to tackle mental health issues. Mental health counseling through digital interventions like chatbots uses sentiment analysis. It can be argued that patients’ states of mind during triage in a clinical setting require them to step out of their comfort zone to discuss topics that potentially carry a social stigma, particularly those related to mental health. Efficiently designed chatbots can be an alternative to traditional counselors for reluctant patients in such cases [17,18]. For example, users of SimCoach, a virtual human intervention platform introduced in 2011, reported satisfying experiences without distress [19]. However, the evaluation also concluded that there was no proven benefit in intent to seek help from the platform, highlighting the scope for improvement.
Sentiment analysis can be applied to EHR data to gather insights on patient or clinician features that are not explicitly coded. For example, a study of EHR data by McCoy et al. [20] revealed that sentiment in discharge notes was associated with readmission rates and mortality risk. The language of clinical narratives has a predominance of nouns and less subjectivity than other sources, like social media data, that are typically used to extract sentiment: while 12% to 15% of terms in medical social media were found to be opinionated, only 5% to 11% were in clinical data [21]. Hence, effective usage requires adaptation of existing methods and a domain-specific sentiment score.
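At its core, lexicon-based sentiment analysis counts polarity-bearing words, as in the sketch below. The word lists here are invented; as noted above, clinical use would require a domain-specific lexicon rather than a generic one.

```python
# Tiny lexicon-based polarity scorer (toy illustration; word lists are invented).
POSITIVE = {"improved", "settled", "stable", "engaging"}
NEGATIVE = {"distressed", "agitated", "deteriorated", "tearful"}

def sentiment(note: str) -> str:
    """Classify a note as positive/negative/neutral by lexicon word counts."""
    words = set(note.lower().split())
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("Patient appeared settled and engaging today"))
```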
  • Text Summarization
Text summarization reduces a large text to provide an abbreviated narrative representation that conveys all the important information and eliminates the less significant bits. In the EHR data of the Southern Health NHS Foundation Trust (SHFT), a large mental health service provider serving 1.4 million in the Hampshire region in the UK, the median number of progress notes per patient is 40. While abundant information can be potentially helpful, time is a limiting constraint. Often, healthcare professionals do not have the time to process and interpret the entirety of a patient’s historical records [22]. Text summarization tools can help prioritize and align the available content to clinicians’ needs.
The usefulness of biomedical text summarization (BTS) systems to facilitate patients’ information seeking was tested in studies by Liu et al. [23] and Yin et al. [24]. Both studies found that users were satisfied with the representativeness of the summaries by the proposed BTS systems relative to the baseline. A systematic review of research publications related to BTS systems by Wang et al. [25] in 2021 revealed that the performance of BTS systems has improved over the years with the use of hybrid approaches that combine rule-based statistical methods with machine learning and NLP models. However, the review found more systems in the biomedical domain focused on summarizing research literature than EHR data.
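A very simple extractive summarizer illustrates the principle: score each sentence by the document-wide frequency of its words and keep the highest-scoring sentences. This is only a frequency heuristic, assuming invented example text; the BTS systems cited above use far richer hybrid models.

```python
import re
from collections import Counter

def summarize(text: str, n_sentences: int = 2) -> str:
    """Extractive summary: keep the top-scoring sentences in original order,
    where a sentence's score is the summed document frequency of its words."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    scored = [(sum(freq[w] for w in re.findall(r"[a-z]+", s.lower())), i, s)
              for i, s in enumerate(sentences)]
    # Pick the n best, then restore document order for readability.
    top = sorted(sorted(scored, reverse=True)[:n_sentences], key=lambda t: t[1])
    return " ".join(s for _, _, s in top)

print(summarize("Low mood reported. Low mood persists. Sleep good.", 2))
```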
  • Keyword extraction
Keyword extraction or keyword detection refers to the process of automatically extracting the most important words and expressions in a text. In a clinical setting, it can help recognize the main topics discussed while a clinician looks through a patient’s historical notes [26]. Additionally, Tang et al. [27] used keyword extraction as a means to interpret the task of classifying progress notes. Since healthcare professionals are reluctant to buy into AI models that operate as “black boxes”, methods like keyword extraction that give tangible and interpretable outputs promote confidence in end users. However, keyword extraction has limitations: it does not capture the meaning of words, and it suffers from the “curse of dimensionality”, wherein computing effort increases exponentially with the number of features.
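A classic keyword-extraction baseline is TF-IDF: words that are frequent in the target note but rare across a background corpus rank highest. The sketch below, with invented example notes, shows the idea; the cited studies use more sophisticated extractors.

```python
import math
from collections import Counter

def keywords(target: str, corpus: list[str], top_k: int = 3) -> list[str]:
    """Rank words in `target` by TF-IDF against a background corpus (sketch only)."""
    docs = [doc.lower().split() for doc in corpus]
    tf = Counter(target.lower().split())       # term frequency in the target note
    n_docs = len(docs)

    def idf(word: str) -> float:
        # Smoothed inverse document frequency over the background notes.
        df = sum(word in doc for doc in docs)
        return math.log((1 + n_docs) / (1 + df)) + 1

    ranked = sorted(tf, key=lambda w: tf[w] * idf(w), reverse=True)
    return ranked[:top_k]

corpus = ["the patient is well", "the patient slept"]
print(keywords("the patient reports insomnia insomnia", corpus))
```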
  • Aspect-Based Opinion Mining
Aspect-based opinion mining (AOBM) is an extension of sentiment analysis and named-entity recognition in that instead of assigning opinion on a whole sentence, AOBM extracts aspects within a text and identifies opinions associated with each aspect. AOBM tools are typically applied on user reviews/feedbacks of products or services to break down customers’ reactions towards individual parts of the product or service. In the case of clinical data, since a single EHR record is likely to contain information on a patient’s medical history, family medical history, diagnoses, adherence to treatment and interventions, etc., it is beneficial to determine opinion and sentiment distinct to each aspect or component.
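A toy heuristic conveys the AOBM idea: detect known aspect terms and pair each with an opinion word found in the same sentence. The aspect and opinion lists are invented for illustration; real AOBM systems learn these associations rather than looking them up.

```python
# Pair each known aspect with an opinion word in the same sentence.
# (Toy heuristic; the aspect and opinion vocabularies below are invented.)
ASPECTS = {"medication", "sleep", "appetite"}
OPINIONS = {"good": "positive", "poor": "negative", "improved": "positive"}

def aspect_opinions(note: str) -> dict[str, str]:
    """Return {aspect: polarity} pairs, one sentence at a time."""
    pairs = {}
    for sentence in note.lower().split("."):
        words = sentence.split()
        aspects = [w for w in words if w in ASPECTS]
        opinions = [w for w in words if w in OPINIONS]
        for a in aspects:
            if opinions:
                pairs[a] = OPINIONS[opinions[0]]
    return pairs

print(aspect_opinions("Sleep improved. Appetite poor."))
```

Note how, unlike whole-sentence sentiment analysis, each aspect receives its own polarity.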
  • Topic modeling
Topic modeling is a form of unsupervised text mining that is used to automatically analyze text data and discover hidden semantic structures. While rule-based systems and supervised learning techniques like NER have been shown to achieve good performance, they require significant manual effort in constructing the rules or predefined labels. Recently, there has been growing attention on unsupervised methods like topic modeling, given their improved performance and ability to discover novel phenotypes [28]. For example, Wang et al. [29] used a topic modeling method called latent Dirichlet allocation (LDA) to discover relationships between bio-terms in the biomedical literature that lead to additional insight into target identification, lead hopping, and drug repurposing. In another example, Gu et al. [30] used LDA on autism surveillance data to uncover causes for the increased prevalence of autism in the US. Such methods can be useful in knowledge discovery that are otherwise latent in EHR data.
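For intuition, a minimal collapsed Gibbs sampler for LDA can be written in a few dozen lines. This is a toy implementation on an invented corpus, intended only to show the mechanics of topic-word and document-topic counts; production work would use a library implementation.

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA; returns top-3 words per topic."""
    rng = random.Random(seed)
    vocab = {w for d in docs for w in d}
    V = len(vocab)
    # Random initial topic assignment per token, plus the count tables.
    z = [[rng.randrange(n_topics) for _ in d] for d in docs]
    ndk = [[0] * n_topics for _ in docs]               # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                                # tokens per topic
    for di, d in enumerate(docs):
        for wi, w in enumerate(d):
            k = z[di][wi]
            ndk[di][k] += 1; nkw[k][w] += 1; nk[k] += 1
    for _ in range(iters):
        for di, d in enumerate(docs):
            for wi, w in enumerate(d):
                # Remove the token, resample its topic, then add it back.
                k = z[di][wi]
                ndk[di][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                weights = [(ndk[di][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + V * beta)
                           for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                z[di][wi] = k
                ndk[di][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return [sorted(nkw[t], key=nkw[t].get, reverse=True)[:3] for t in range(n_topics)]

docs = [["sleep", "insomnia", "sleep"], ["mood", "anxiety", "mood"]]
print(lda_gibbs(docs))
```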
There are several ways that NLP techniques could aid data mining from unstructured free text. However, currently there is no NLP software that can be bought off the shelf, fed a sample of clinical notes, and used to produce highly accurate, statistically significant inferences. Large amounts of data are required to train and validate an NLP algorithm to obtain outcomes of interest specific to the target domain. Further, the use of patient data is sensitive, requiring extensive ethics approval and strict adherence to governance frameworks. In the UK, mental health service providers are distributed regionally by independent trust organizations. Data linkage attempts at a national level across these providers were delayed owing to complex legal and governance pathways [31].
  • NLP architectures for exploring EHR data
Several NLP architectures have been used in EHR data analysis (Table 2). Each method has its advantages and challenges, making the selection of the appropriate architecture critical for accurate and efficient processing. Hybrid models, which combine rule-based approaches with deep learning, are increasingly being used for improved accuracy and scalability.

4. Southern Health NHS Foundation Trust’s NLP Approach for UK-CRIS

With the increased involvement of information technology in the patient–clinician relationship, there is a growing number of opportunities to provide effective and optimized care. As of December 2021, there were over 31 million clinical notes from across 200,000 patients recorded in the EHR system of the Southern Health NHS Foundation Trust (SHFT) since its deployment in 2010. However, only around 30,000 patients have their diagnoses formally recorded in structured fields, while the rest either did not have a formal diagnosis or their diagnoses were only captured in clinical notes, but not entered in the system under structured fields. In another instance, a study to explore EHR data from the Oxford Mental Health NHS Foundation Trust revealed much of the structured non-clinical data on social and behavioral aspects of patients had missing data, making it inadequate for efficient research use [32]. Hence, not leveraging the unstructured clinical notes for research purposes was a missed opportunity. Consequently, the Akrivia Health platform (previously known as UK-CRIS) was developed to provide authorized researchers with regulated, secure access to an entirely anonymized version of the EHR data [33].
The anonymized version of the data allows researchers to access the clinical details for research purposes, but not the personal details of the patients like the name, address, contact details, etc. Figure 2 shows the pipeline setup for the secure processing and access of EHR data. The applications to access the platform and carry out analyses are closely reviewed, monitored, and audited by an oversight committee. Figure 3 shows the workflow process employed to ensure that all research activities follow and comply with the ethical and legal guidelines.
One of the primary benefits of anonymization of EHR data is that it speeds up the process of patient recruitment for research studies [34]. Researchers do not obtain direct access to the medical records of patients without prior consent. As such, the process to verify the eligibility of a patient for a particular study must be through a member of the patient’s clinical team. Alternatively, researchers with research ethics committee (REC)/Health Research Authority (HRA) approvals in place could contact all those patients who had given generic consent to be contacted for research opportunities and then narrow these down to eligible and consenting participants for a particular study. One could see why both these approaches are time-consuming [35]. With the anonymized data, researchers can now obtain prior information on the number of patients who would be eligible for a study and then approach clinicians to assist in their recruitment [34].
Apart from anonymization of the clinical text, NLP tools are implemented to recognize specific details from unstructured text and generate structured data on research-relevant topics that reliably reflect the information recorded in the clinical notes. The medical concepts that are currently extracted include mental health diagnoses, medications, signs and symptoms, and sleep quality. NLP models can be effectively validated by comparing the extracted information with that from human annotators. Human annotators here were people with clinical experience, primarily mental health nurses.
Two primarily used performance measures in NLP models are “precision” and “recall”. While “precision” gives the percentage of positive predictions that are correct, “recall” gives the percentage of positive cases that are correctly predicted as positive. The performance of the model can then be summarized in a single measure called the F1 score, which is the harmonic mean of precision and recall. The F1 scores achieved on the abovementioned medical concepts are shown in Table 3.
Equations to determine the precision, recall, and F1 scores are as shown below.
Precision = True Positives / (True Positives + False Positives)
Recall = True Positives / (True Positives + False Negatives)
F1 score = 2 × (Precision × Recall) / (Precision + Recall)
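The three metrics can be computed directly from the counts of true positives, false positives, and false negatives:

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and their harmonic mean (the F1 score)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# e.g., 8 correct extractions, 2 spurious, 2 missed:
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(p, r, f1)  # 0.8 0.8 0.8
```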
The availability of NLP-extracted structured fields from unstructured clinical notes opens up a whole host of opportunities from a mental health research point of view. For instance, while only 10,675 patients were recorded with a formal diagnosis of dementia in the structured fields of SHFT’s EHR database, an NLP-extracted diagnosis field identified 61,758 patients with dementia [36]. This gives a larger pool of patients to explore dementia-related research. Also, since concepts like medications and symptoms are now available as structured fields, it is possible to explore hypotheses on their association with different forms of mental illness, care pathways, etc.
With a brief understanding of different NLP methods that can potentially be used on free text in the EHR data, Table 4 below highlights some of the studies conducted within the mental health domain in the UK.

5. Recent Advances in NLP Methods in Healthcare

We recognize the evolving landscape of NLP in mental health informatics and highlight promising pathways for future exploration. Recent advancements in NLP, particularly zero-shot and few-shot learning, have shown promising potential in addressing key challenges in clinical text processing, such as the need for extensive manual annotations. Transformer-based architectures like GPT-3, T5, and BERT variants (e.g., Med-BERT, BioBERT) have demonstrated the ability to extract meaningful information from unstructured text with minimal labeled data [43,44]. These models leverage transfer learning and large-scale pretraining on biomedical corpora, enabling them to generalize effectively across various clinical tasks, including NER, sentiment analysis, and topic modeling [45].
The integration of these advanced techniques could significantly enhance the processing of EHRs. Few-shot learning, for instance, allows models to identify rare clinical conditions and nuanced medical terms even with limited supervision [46]. Additionally, zero-shot learning can facilitate the extraction of relevant information from previously unseen data, enabling faster and more scalable deployment of NLP tools in mental health research. By reducing dependency on large annotated datasets, these models could streamline workflows and improve efficiency in clinical decision-making [47].
Despite their advantages, zero-shot and few-shot transformer models present challenges that need further exploration. Issues such as model interpretability, domain adaptation, and clinical accuracy remain critical concerns [48]. Moreover, these models may exhibit biases, hallucinate information, or struggle with domain-specific medical language if not fine-tuned appropriately [49]. Ethical considerations, particularly regarding patient privacy and governance of AI-generated insights, also require careful attention. Future research should focus on optimizing these models for domain-specific use cases while ensuring compliance with regulatory standards in healthcare AI applications [50].

6. Large-Scale Language Models in Healthcare

Recent advancements in large-scale language models, such as ChatGPT-3.5, have demonstrated significant potential in natural language processing (NLP) applications across various domains, including healthcare [43]. The feasibility of integrating ChatGPT-like models within the secure NHS environment presents both opportunities and challenges. These models could play a transformative role in summarizing medical documents, assisting clinicians in quickly extracting relevant patient information from extensive EHRs, and improving clinical decision-making [51].
One of the most promising applications of ChatGPT-like models in healthcare is their ability to summarize lengthy clinical documents. Clinicians often face an overwhelming volume of patient records, including consultation notes, discharge summaries, and treatment plans. A well-trained chat-based NLP system could efficiently extract key insights, highlight critical information such as recent diagnoses, prescribed medications, and symptom progression, and present a concise summary for rapid review [52]. This would significantly reduce the cognitive burden on clinicians, improve workflow efficiency, and facilitate better patient care.
The implementation of ChatGPT-like models within the NHS would require a carefully designed framework to ensure security, data integrity, and compliance with healthcare regulations [53]. Unlike conventional NLP tools, which may rely on cloud-based processing, the deployment of such models in a clinical setting would necessitate secure, on-premise infrastructure or federated learning approaches to maintain patient data privacy [54]. Additionally, fine-tuning these models using NHS-specific data while adhering to strict governance protocols would be essential to enhance domain adaptation and accuracy in clinical contexts.
Moreover, real-time auditing and explainability mechanisms would need to be integrated to ensure clinicians can trust AI-generated summaries. Mechanisms such as model interpretability tools, user feedback loops, and continuous validation against human-annotated summaries would be critical to maintaining accuracy and reliability in medical applications [50].
The integration of ChatGPT-like systems in clinical practice presents multiple benefits. These include improved efficiency in documentation review, enhanced accessibility of patient information, and potential reductions in clinician burnout by minimizing the time spent navigating unstructured EHRs [47]. Additionally, such systems could assist in patient engagement through AI-driven virtual assistants that provide general health-related guidance while ensuring that critical decisions remain clinician-led [47].

7. Challenges and Limitations

This study was a methodological evaluation, not a verification of an implementation protocol. The implementation protocol with key deliverables cannot be discussed comprehensively here, but we acknowledge that there may be challenges in upgrading the infrastructure and integrating with other hospital information systems. We will explore this in future work.
Despite the fast evolution of LLMs and their NLP applications, any attempt to deploy ChatGPT in routine healthcare practice will face unprecedented ethical and legal barriers. There are notable challenges that must be addressed before full-scale adoption. Ethical considerations, including data security, bias in AI-generated text, and the risk of misinterpretation of AI summaries, must be thoroughly evaluated [32,55,56]. Ensuring transparency in decision support, defining the legal accountability of AI-assisted recommendations, and developing rigorous validation processes are necessary steps to mitigate risks associated with AI deployment in healthcare.
Several publicly available databases have been utilized for training NLP models in healthcare. MIMIC-III (Medical Information Mart for Intensive Care) is one of the most widely used datasets, containing de-identified clinical notes and structured health data from intensive care units, facilitating research in predictive modeling and medical NLP [57]. Another key resource is the n2c2 (formerly i2b2) clinical NLP challenge datasets, which provide annotated electronic health record (EHR) data for medical concept extraction, entity recognition, and relation identification tasks [58]. Additionally, UK-CRIS (Clinical Record Interactive Search) offers anonymized text data from NHS mental health trusts, enabling large-scale mental health research while ensuring compliance with privacy regulations [59]. The inclusion of such datasets enhances the development and validation of NLP models, supporting their application in real-world clinical environments.

8. Conclusions

Mental health EHR data processing involves large volumes of clinical notes. NLP methods offer new opportunities to explore the underlying unstructured data in these notes. Access to harmonized, anonymized data helps remove the barrier of exposing sensitive patient-identifiable data to non-clinical researchers. However, we can only conclude that theoretically our method is suitable for mental healthcare services, but to conclusively suggest the method is generalizable requires a technical validation study with multiple-site involvement. While ChatGPT-like models hold significant potential for transforming clinical workflows within the NHS, their adoption must be accompanied by robust security measures, domain-specific fine-tuning, and comprehensive ethical oversight. Future research should focus on evaluating real-world performance, integrating clinician feedback mechanisms, and ensuring that AI-generated insights align with clinical best practices and regulatory standards.

Author Contributions

Conceptualization, G.D. and P.P.; writing—original draft preparation, G.D., Y.B., S.S., H.C. and P.P.; writing—review and editing, G.D., Y.B., S.S., H.C. and P.P.; visualization, P.P.; supervision, P.P.; project administration, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

The authors acknowledge support from Kathryn Elliot and Donatella Fontana from Hampshire & Isle of Wight Healthcare NHS Foundation Trust (formerly Southern Health NHS Foundation Trust), as well as support from Southern University of Science and Technology, and University of Southampton.

Conflicts of Interest

The authors report no conflicts of interest. The views expressed are those of the authors and not necessarily those of the NHS, the National Institute for Health Research, the Department of Health and Social Care, or the academic institutions.

Abbreviations

AI: artificial intelligence
ADEs: adverse drug events
AOBM: aspect-based opinion mining
BTS: biomedical text summarization
IE: information extraction
IR: information retrieval
EHR: electronic health records
GATE: general architecture for text engineering
LDA: latent Dirichlet allocation
NER: named-entity recognition
NHS: National Health Service
NLP: natural language processing
SHFT: Southern Health NHS Foundation Trust
UK: United Kingdom
UK-CRIS: UK Clinical Record Interactive Search

References

  1. Vigo, D.; Thornicroft, G.; Atun, R. Estimating the true global burden of mental illness. Lancet Psychiatry 2016, 3, 171–178. [Google Scholar] [CrossRef] [PubMed]
  2. NHS Digital. Mental Health Bulletin 2020–2021 Annual Report. 2020. Available online: https://digital.nhs.uk/data-and-information/publications/statistical/mental-health-bulletin/2020-21-annual-report (accessed on 14 October 2024).
  3. Dorning, H.; Davies, A.; Blunt, I. Focus on: People with Mental Ill Health and Hospital Use. Exploring Disparities in Hospital Use for Physical Healthcare; The Nuffield Trust: London, UK, 2015. [Google Scholar]
  4. Lin, J.; Jiao, T.; Biskupiak, J.E.; McAdam-Marx, C. Application of electronic medical record data for health outcomes research: A review of recent literature. Expert Rev. Pharmacoecon. Outcomes Res. 2013, 13, 191–200. [Google Scholar] [CrossRef] [PubMed]
  5. Jensen, K.; Soguero-Ruiz, C.; Oyvind Mikalsen, K.; Lindsetmo, R.O.; Kouskoumvekaki, I.; Girolami, M.; Olav Skrovseth, S.; Augestad, K.M. Analysis of free text in electronic health records for identification of cancer patient trajectories. Sci. Rep. 2017, 7, 46226. [Google Scholar] [CrossRef] [PubMed]
  6. Cohen, K.B.; Hunter, L. Natural language processing and systems biology. In Artificial Intelligence Methods and Tools for Systems Biology; Springer: Dordrecht, The Netherlands, 2004; pp. 147–173. [Google Scholar]
  7. Locke, W.N.; Booth, A.D. (Eds.) Machine Translation of Languages; Fourteen essays; John Wiley: New York, NY, USA, 1955; ISBN 9780262120029. [Google Scholar]
  8. Tenney, I.; Xia, P.; Chen, B.; Wang, A.; Poliak, A.; McCoy, R.T.; Kim, N.; Van Durme, B.; Bowman, S.R.; Das, D.; et al. What do you learn from context? Probing for sentence structure in contextualized word representations. arXiv 2019, arXiv:1905.06316. [Google Scholar] [CrossRef]
  9. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar] [CrossRef]
  10. Carlini, N.; Tramer, F.; Wallace, E.; Jagielski, M.; Herbert-Voss, A.; Lee, K.; Roberts, A.; Brown, T.; Song, D.; Erlingsson, U.; et al. Extracting training data from large language models. In Proceedings of the 30th USENIX Security Symposium, Vancouver, BC, Canada, 11–13 August 2021; pp. 2633–2650. [Google Scholar] [CrossRef]
  11. Lawson, N.; Eustice, K.; Perkowitz, M.; Yetisgen-Yildiz, M. Annotating large email datasets for named entity recognition with mechanical turk. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk, Los Angeles, CA, USA, 6 June 2010; pp. 71–79. [Google Scholar]
  12. Lu, W.; Guttentag, A.; Elbel, B.; Kiszko, K.; Abrams, C.; Kirchner, T.R. Crowdsourcing for food purchase receipt annotation via amazon mechanical Turk: A feasibility study. J. Med. Int. Res. 2019, 21, e12047. [Google Scholar] [CrossRef] [PubMed]
  13. Entzeridou, E.; Markopoulou, E.; Mollaki, V. Public and physician’s expectations and ethical concerns about electronic health record: Benefits outweigh risks except for information security. Int. J. Med. Inform. 2018, 110, 98–107. [Google Scholar] [CrossRef]
  14. Kormilitzin, A.; Vaci, N.; Liu, Q.; Nevado-Holgado, A. Med7: A transferable clinical natural language processing model for electronic health records. Artif. Intell. Med. 2021, 118, 102086. [Google Scholar] [CrossRef]
  15. Carson, L.; Jewell, A.; Downs, J.; Stewart, R. Multisite data linkage projects in mental health research. Lancet Psychiatry 2020, 7, e61. [Google Scholar] [CrossRef]
  16. Botelle, R.; Bhavsar, V.; Kadra-Scalzo, G.; Mascio, A.; Williams, M.V.; Roberts, A.; Velupillai, S.; Stewart, R. Can natural language processing models extract and classify instances of interpersonal violence in mental healthcare electronic records: An applied evaluative study. BMJ Open 2022, 12, e052911. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
  17. Rizzo, A.A.; Lange, B.; Buckwalter, J.G.; Forbell, E.; Kim, J.; Sagae, K.; Williams, J.; Rothbaum, B.O.; Difede, J.; Reger, G.; et al. An intelligent virtual human system for providing healthcare information and support. Stud. Health Technol. Inform. 2011, 163, 503–509. [Google Scholar] [CrossRef] [PubMed]
  18. Bhakta, R.; Savin-Baden, M.; Tombs, G. Sharing Secrets with Robots? 2014. Available online: https://www.learntechlib.org/primary/p/147797/ (accessed on 14 October 2024).
  19. Meeker, D.; Cerully, J.L.; Johnson, M.; Iyer, N.; Kurz, J.; Scharf, D.M. SimCoach evaluation: A virtual human intervention to encourage service-member help-seeking for posttraumatic stress disorder and depression. Rand Health Q. 2016, 5, 13. [Google Scholar]
  20. McCoy, T.H.; Castro, V.M.; Cagan, A.; Roberson, A.M.; Kohane, I.S.; Perlis, R.H. Sentiment measured in hospital discharge notes is associated with readmission and mortality risk: An electronic health record study. PLoS ONE 2015, 10, e0136341. [Google Scholar] [CrossRef]
  21. Denecke, K.; Deng, Y. Sentiment analysis in medical settings: New opportunities and challenges. Artif. Intell. Med. 2015, 64, 17–27. [Google Scholar] [CrossRef]
  22. Martin, S.A.; Sinsky, C.A. The map is not the territory: Medical records and 21st century practice. Lancet 2016, 388, 2053–2056. [Google Scholar] [CrossRef]
  23. Liu, Y.H.; Song, X.; Chen, S.F. Long story short: Finding health advice with informative summaries on health social media. Aslib J. Inf. Manag. 2019, 71, 821–840. [Google Scholar] [CrossRef]
  24. Yin, Y.; Zhang, Y.; Liu, X.; Zhang, Y.; Xing, C.; Chen, H. HealthQA: A Chinese QA summary system for smart health. In Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2014; pp. 51–62. [Google Scholar] [CrossRef]
  25. Wang, M.; Wang, M.; Yu, F.; Yang, Y.; Walker, J.; Mostafa, J. A systematic review of automatic text summarization for biomedical literature and EHRs. J. Am. Med. Inform. Assoc. 2021, 28, 2287–2297. [Google Scholar] [CrossRef]
  26. Wu, P.H.; Yu, A.; Tsai, C.W.; Koh, J.L.; Kuo, C.C.; Chen, A.L. Keyword extraction and structuralization of medical reports. Health Inf. Sci. Syst. 2020, 8, 18. [Google Scholar] [CrossRef] [PubMed]
  27. Tang, M.; Gandhi, P.; Kabir, M.A.; Zou, C.; Blakey, J.; Luo, X. Progress notes classification and keyword extraction using attention-based deep learning models with BERT. arXiv 2019, arXiv:1910.05786. [Google Scholar] [CrossRef]
  28. Zeng, Z.; Deng, Y.; Li, X.; Naumann, T.; Luo, Y. Natural Language Processing for EHR-Based Computational Phenotyping. IEEE/ACM Trans. Comput. Biol. Bioinform. 2019, 16, 139–153. [Google Scholar] [CrossRef] [PubMed]
  29. Wang, H.; Ding, Y.; Tang, J.; Dong, X.; He, B.; Qiu, J.; Wild, D.J. Finding complex biological relationships in recent PubMed articles using Bio-LDA. PLoS ONE 2011, 6, e17243. [Google Scholar] [CrossRef] [PubMed]
  30. Gu, Y.; Leroy, G. Large-scale analysis of free-text data for mental health surveillance with topic modelling. In 26th Americas Conference on Information Systems, AMCIS 2020; Association for Information Systems: Atlanta, GA, USA, 2020; ISBN 9781733632546. [Google Scholar]
  31. Fernandes, A.C.; Cloete, D.; Broadbent, M.; Hayes, R.D.; Chang, C.K.; Jackson, R.G.; Roberts, A.; Tsang, J.; Soncul, M.; Liebscher, J.; et al. Development and evaluation of a de-identification procedure for a case register sourced from mental health electronic records. BMC Med. Inform. Decis. Mak. 2013, 13, 71. [Google Scholar] [CrossRef] [PubMed]
  32. Goodday, S.M.; Kormilitzin, A.; Vaci, N.; Liu, Q.; Cipriani, A.; Smith, T.; Nevado-Holgado, A. Maximizing the use of social and behavioural information from secondary care mental health electronic health records. J. Biomed. Inform. 2020, 107, 103429. [Google Scholar] [CrossRef]
  33. Callard, F.; Broadbent, M.; Denis, M.; Hotopf, M.; Soncul, M.; Wykes, T.; Lovestone, S.; Stewart, R. Developing a new model for patient recruitment in mental health services: A cohort study using Electronic Health Records. BMJ Open 2014, 4, e005654. [Google Scholar] [CrossRef]
  34. Walker, S.; Potts, J.; Martos, L.; Barrera, A.; Hancock, M.; Bell, S.; Geddes, J.; Cipriani, A.; Henshall, C. Consent to discuss participation in research: A pilot study. Evid.-Based Ment. Health 2020, 23, 77–82. [Google Scholar] [CrossRef]
  35. Vaci, N.; Koychev, I.; Kim, C.H.; Kormilitzin, A.; Liu, Q.; Lucas, C.; Dehghan, A.; Nenadic, G.; Nevado-Holgado, A. Real-world effectiveness, its predictors and onset of action of cholinesterase inhibitors and memantine in dementia: Retrospective health record study. Br. J. Psychiatry 2021, 218, 261–267. [Google Scholar] [CrossRef]
  36. Jackson, R.G.; Ball, M.; Patel, R.; Hayes, R.D.; Dobson, R.J.; Stewart, R. TextHunter—A User Friendly Tool for Extracting Generic Concepts from Free Text in Clinical Research. AMIA Annu. Symp. Proc. 2014, 2014, 729–738. [Google Scholar] [PubMed] [PubMed Central]
  37. Iqbal, E.; Mallah, R.; Jackson, R.G.; Ball, M.; Ibrahim, Z.M.; Broadbent, M.; Dzahini, O.; Stewart, R.; Johnston, C.; Dobson, R.J. Identification of Adverse Drug Events from Free Text Electronic Patient Records and Information in a Large Mental Health Case Register. PLoS ONE 2015, 10, e0134208. [Google Scholar] [CrossRef] [PubMed]
  38. Kadra, G.; Stewart, R.; Shetty, H.; Jackson, R.G.; Greenwood, M.A.; Roberts, A.; Chang, C.K.; MacCabe, J.H.; Hayes, R.D. Extracting antipsychotic polypharmacy data from electronic health records: Developing and evaluating a novel process. BMC Psychiatry 2015, 15, 166. [Google Scholar] [CrossRef]
  39. Patel, R.; Jayatilleke, N.; Broadbent, M.; Chang, C.K.; Foskett, N.; Gorrell, G.; Hayes, R.D.; Jackson, R.; Johnston, C.; Shetty, H.; et al. Negative symptoms in schizophrenia: A study in a large clinical sample of patients using a novel automated method. BMJ Open 2015, 5, e007619. [Google Scholar] [CrossRef] [PubMed]
  40. Patel, R.; Wilson, R.; Jackson, R.; Ball, M.; Shetty, H.; Broadbent, M.; Stewart, R.; McGuire, P.; Bhattacharyya, S. Cannabis use and treatment resistance in first episode psychosis: A natural language processing study. Lancet 2015, 385, S79. [Google Scholar] [CrossRef] [PubMed]
  41. Taylor, C.L.; Stewart, R.; Ogden, J.; Broadbent, M.; Pasupathy, D.; Howard, L.M. The characteristics and health needs of pregnant women with schizophrenia compared with bipolar disorder and affective psychoses. BMC Psychiatry 2015, 15, 88. [Google Scholar] [CrossRef] [PubMed]
  42. Downs, J.; Velupillai, S.; George, G.; Holden, R.; Kikoler, M.; Dean, H.; Fernandes, A.; Dutta, R. Detection of Suicidality in Adolescents with Autism Spectrum Disorders: Developing a Natural Language Processing Approach for Use in Electronic Health Records. AMIA Annu. Symp. Proc. 2018, 2017, 641–649. [Google Scholar] [PubMed]
  43. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
  44. Devlin, J. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
  45. Lee, J.; Yoon, W.; Kim, S.; Kim, D.; Kim, S.; So, C.H.; Kang, J. BioBERT: A pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 2020, 36, 1234–1240. [Google Scholar] [CrossRef]
  46. Gururangan, S.; Marasović, A.; Swayamdipta, S.; Lo, K.; Beltagy, I.; Downey, D.; Smith, N.A. Don’t stop pretraining: Adapt language models to domains and tasks. arXiv 2020, arXiv:2004.10964. [Google Scholar]
  47. Wang, X.; Zhang, Y.; Ren, X.; Zhang, Y.; Jiang, J.; Han, J. An empirical study of pre-trained transformer models for biomedical text mining. ACM Trans. Comput. Biol. Bioinform. 2021, 18, 1432–1445. [Google Scholar]
  48. Doshi-Velez, F.; Kim, B. Towards a rigorous science of interpretable machine learning. arXiv 2017, arXiv:1702.08608. [Google Scholar]
  49. Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT’21), Athens, Greece, 3–10 March 2021. [Google Scholar]
  50. Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215. [Google Scholar] [CrossRef]
  51. Esteva, A.; Robicquet, A.; Ramsundar, B.; Kuleshov, V.; DePristo, M.; Chou, K.; Cui, C.; Corrado, G.S.; Thrun, S.; Dean, J. A guide to deep learning in healthcare. Nat. Med. 2021, 25, 24–29. [Google Scholar] [CrossRef] [PubMed]
  52. Goh, G.B.; Hodas, N.O.; Vishnu, A. Deep learning for healthcare: Review, opportunities, and challenges. Brief. Bioinform. 2021, 22, 858–875. [Google Scholar] [CrossRef]
  53. He, J.; Liu, Z.; Xia, Y.; Wang, J.; Zhang, X.; Liu, Y. Analyzing the potential of ChatGPT-like models in healthcare: Opportunities and challenges. J. Med. Internet Res. 2019, 21, e16279. [Google Scholar] [CrossRef]
  54. Shen, Y.; Zhang, R.; Jiang, X.; Wang, J.; Liu, Y. Advances in Natural Language Processing for Clinical Text: Applications and Challenges. J. Biomed. Inform. 2021, 118, 103799. [Google Scholar] [CrossRef]
  55. Qin, C.; Zhang, Z.; Chen, J.; Yasunaga, M.; Yang, D. Is chatgpt a general-purpose natural language processing task solver? arXiv 2023, arXiv:2302.06476. [Google Scholar]
  56. Franciscu, S. ChatGPT: A Natural Language Generation Model for Chatbots. 2023. Available online: https://doi.org/10.13140/RG.2.2.24777.83044 (accessed on 14 October 2024).
  57. Johnson, A.E.W.; Pollard, T.J.; Shen, L.; Lehman, L.-W.H.; Feng, M.; Ghassemi, M.; Moody, B.; Szolovits, P.; Celi, L.A.; Mark, R.G. MIMIC-III, a freely accessible critical care database. Sci. Data 2016, 3, 160035. [Google Scholar] [CrossRef]
  58. Uzuner, O.; South, B.R.; Shen, S.; DuVall, S.L. 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. J. Am. Med. Inform. Assoc. 2011, 18, 552–556. [Google Scholar] [CrossRef]
  59. Stewart, R.; Soremekun, M.; Perera, G.; Broadbent, M.; Callard, F.; Denis, M.; Hotopf, M.; Lovestone, S. The South London and Maudsley NHS Foundation Trust Biomedical Research Centre (SLAM BRC) case register: Development and descriptive data. BMC Psychiatry 2009, 9, 51. [Google Scholar] [CrossRef]
Figure 1. Distribution of care across primary, secondary, and tertiary care providers for mental health service settings in the UK.
Figure 2. Dataflow pipeline for secure and regulated access of EHR data for research purposes. (Trust—an organization that provides secondary health services within the English and Welsh National Health Service).
Figure 3. Workflow of governance approval process to access Akrivia Health data for research.
Table 1. Advantages and disadvantages/limitations of NLP-based approach to leverage free text in EHR records.
Advantage: Ability to extract relevant information from unstructured text [5,14].
Limitation: Challenges in processing free text include spelling and grammar errors, non-standard abbreviations and punctuation, acronyms, hedge phrases, and variability in the information recorded from practitioner to practitioner [5,14].

Advantage: Ability to deal with large volumes of data [11,15].
Limitation: Need for a large volume of training data to build models that are robust and can provide clinically meaningful outcomes [11,15].

Advantage: Can save time, as NLP tools can give clinicians easy and rapid access to relevant patient data [4,5].
Limitation: NLP tools can miss context-sensitive meanings and temporal relationships across sentences in a long clinical text; quality control and validation techniques are still needed to ensure the practical usability of NLP outputs [4,9].

Advantage: Can identify pertinent clinicopathological parameters of disease and key indicators of follow-up outcomes [8].
Limitation: Issues with incomplete data, missing information, and inconsistencies in describing disease status/staging within or between hospital sites; differences in practice, followed guidance, and scoring systems are real examples [4,5,11].
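A common first line of defense against the free-text challenges listed above (non-standard abbreviations, inconsistent punctuation) is a normalization pass applied before any model sees the text. The sketch below is purely illustrative: the abbreviation map and the `normalize_note` function are our own assumptions, not part of any pipeline described in this study; a real deployment would draw on a curated lexicon such as UMLS rather than a toy dictionary.

```python
import re

# Hypothetical lookup of clinical shorthand -> canonical form.
# A real system would use a curated resource, not this toy map.
ABBREVIATIONS = {
    "pt": "patient",
    "hx": "history",
    "dx": "diagnosis",
    "rx": "prescription",
    "sob": "shortness of breath",
}

def normalize_note(text: str) -> str:
    """Lower-case a note and expand known abbreviations token by token."""
    # Split into alphabetic tokens and single non-space characters
    # so punctuation survives as separate tokens.
    tokens = re.findall(r"[a-zA-Z]+|\S", text.lower())
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)

print(normalize_note("Pt reports SOB; hx of depression."))
# -> patient reports shortness of breath ; history of depression .
```

Downstream extraction rules then only need to handle the canonical forms, which is one way to blunt the practitioner-to-practitioner variability noted in the table.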
Table 2. Overview of the NLP architectures used on EHR data, indicating their main strengths and weaknesses. SNOMED CT, systematized nomenclature of medicine—clinical terms; UMLS, unified medical language system; LSTM, long short-term memory; GRU, gated recurrent unit; CNN, convolutional neural network; BERT, bidirectional encoder representations from transformers; GPT, generative pretrained transformer.
Rule-Based NLP
Uses manually defined rules, regular expressions, and medical lexicons such as SNOMED CT and UMLS
Pros: High precision for structured tasks
Cons: Poor scalability, struggles with complex language patterns
Traditional Machine Learning (ML)
Employs statistical models (support vector machine, naïve Bayes, random forest)
Pros: Works well with small datasets
Cons: Requires domain-specific feature extraction
Deep Learning
Utilizes neural networks (LSTM, GRU, CNN) for text processing
Pros: Captures complex patterns and sequential dependencies
Cons: Needs large labeled datasets and high computational power
Transformers
Advanced architectures like BERT, BioBERT, ClinicalBERT, and GPT
Pros: Context-aware, state-of-the-art performance
Cons: Computationally expensive, requires fine-tuning for medical applications
Hybrid Approaches
Combine rule-based NLP with deep learning for optimal accuracy
Pros: Leverages domain knowledge and data-driven insights
Cons: Integration complexity and computational overhead
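To make the first row of the table concrete, a minimal rule-based extractor can be built from nothing more than a lexicon and a regular expression, which is also why the approach scales poorly to complex language. Everything in the sketch below (the three-drug lexicon, the dose pattern) is a toy assumption for illustration and does not stand in for a real SNOMED CT or UMLS resource.

```python
import re

# Toy medication lexicon standing in for a real vocabulary (e.g. SNOMED CT).
MEDICATIONS = {"sertraline", "olanzapine", "diazepam"}

# Dose pattern: a number followed by a common unit, e.g. "50 mg".
DOSE_RE = re.compile(r"\b(\d+(?:\.\d+)?)\s*(mg|ml|g)\b", re.IGNORECASE)

def extract_entities(note: str) -> dict:
    """Return medication mentions and dose strings found by simple rules."""
    words = set(re.findall(r"[a-z]+", note.lower()))
    meds = sorted(MEDICATIONS & words)
    doses = [" ".join(match) for match in DOSE_RE.findall(note)]
    return {"medications": meds, "doses": doses}

print(extract_entities("Commenced Sertraline 50 mg daily; review in 2 weeks."))
# -> {'medications': ['sertraline'], 'doses': ['50 mg']}
```

The precision of such rules is high on the patterns they encode, but any spelling variant or unlisted drug is silently missed, which is exactly the gap the learned approaches in the rest of the table aim to close.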
Table 3. F1 scores for the medical concepts extracted from the clinical notes.
Concept: F1 score
Symptoms and signs: 0.80
Medication: 0.94
Dosage: 0.95
Mental health diagnosis: 0.83
Sleep quality: 0.85
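For reference, the F1 score reported in the table is the harmonic mean of precision and recall. The short sketch below shows the arithmetic; the counts it uses are invented purely to illustrate the calculation and are not taken from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp)   # fraction of extracted items that are correct
    recall = tp / (tp + fn)      # fraction of true items that were extracted
    return 2 * precision * recall / (precision + recall)

# Hypothetical medication-extraction counts: 94 true positives,
# 6 false positives, 6 false negatives.
print(round(f1_score(94, 6, 6), 2))  # -> 0.94
```

Because F1 is a harmonic mean, it penalizes an imbalance between precision and recall more heavily than a simple average would, which is why it is the conventional summary metric for extraction tasks like these.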
Table 4. Studies that explored NLP tools on anonymized EHR data from mental health NHS trusts in the UK.
Study: Development and evaluation of a de-identification procedure for a case register sourced from mental health electronic records (Fernandes et al. [31]). NLP method: pattern matching. Outcome/validation: a de-identified psychiatric database sourced from EHRs, which protects patient anonymity and increases data availability for research.

Study: TextHunter—A User Friendly Tool for Extracting Generic Concepts from Free Text in Clinical Research (Jackson et al. [36]). NLP method: concept extraction model. Outcome/validation: a tool for creating training data to construct concept extraction models; confidence thresholds on accuracy measures such as precision and recall were used for validation.

Study: Identification of Adverse Drug Events from Free Text Electronic Patient Records and Information in a Large Mental Health Case Register (Iqbal et al. [37]). NLP method: text mining. Outcome/validation: mined instances of adverse drug events (ADEs) related to antipsychotic therapy from free-text content; the tool identified extrapyramidal side effects with >0.85 precision and >0.86 recall during testing.

Study: Extracting antipsychotic polypharmacy data from electronic health records: developing and evaluating a novel process (Kadra et al. [38]). NLP method: named-entity extraction. Outcome/validation: individual instances of antipsychotic prescribing and co-prescription were extracted from both structured and free-text fields in EHRs; validity was assessed against a manually coded gold standard to establish precision and recall.

Study: Negative symptoms in schizophrenia: a study in a large clinical sample of patients using a novel automated method (Patel et al. [39]). NLP method: text mining, aspect-based opinion mining. Outcome/validation: 10 different negative symptoms were ascertained from the clinical records of patients with schizophrenia; associations between demographic aspects (age, gender, marital status) and hospitalization aspects (likelihood of admission, readmission, length of admission) were also determined.

Study: Cannabis use and treatment resistance in first episode psychosis: a natural language processing study (Patel et al. [40]). NLP method: keyword extraction. Outcome/validation: cannabis use documented in free-text clinical records was identified and extracted to determine its association with hospital admissions.

Study: The characteristics and health needs of pregnant women with schizophrenia compared with bipolar disorder and affective psychoses (Taylor et al. [41]). NLP method: General Architecture for Text Engineering (GATE) software. Outcome/validation: information on medication was extracted from free text using structured indicators, covering the 3 months before pregnancy and the first trimester; two raters cross-checked 5 cases each week until satisfactory reliability was obtained, after which a consecutive 22 cases (26 pregnancies) were independently rated for reliability analyses.

Study: Detection of Suicidality in Adolescents with Autism Spectrum Disorders: Developing a Natural Language Processing Approach for Use in Electronic Health Records (Downs et al. [42]). NLP method: topic modeling. Outcome/validation: an NLP tool was developed to capture suicidality within clinical texts; evaluation against human annotators yielded precision, recall, and F1 scores above 0.8.

Study: Med7: a transferable clinical natural language processing model for electronic health records (Kormilitzin et al. [14]). NLP method: named-entity recognition. Outcome/validation: a model was trained to recognize attributes related to medication; through transfer learning and fine-tuning, it achieved an F1 score of 0.944.
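Several of the studies above, such as the de-identification work of Fernandes et al. [31], rest on pattern matching over free text. The sketch below illustrates only the general idea: the two patterns and placeholder tags are our own toy assumptions, and a production de-identification pipeline would need far richer rules, dictionaries, and manual validation.

```python
import re

# Illustrative patterns only: a UK-style NHS number (3-3-4 digits) and a
# simple "Mr/Mrs/Ms/Dr <Surname>" title pattern. Real pipelines need many more.
PATTERNS = [
    (re.compile(r"\b\d{3}\s?\d{3}\s?\d{4}\b"), "[NHS_NUMBER]"),
    (re.compile(r"\b(?:Mr|Mrs|Ms|Dr)\.?\s+[A-Z][a-z]+"), "[NAME]"),
]

def deidentify(text: str) -> str:
    """Replace matches of each identifier pattern with a placeholder tag."""
    for pattern, tag in PATTERNS:
        text = pattern.sub(tag, text)
    return text

print(deidentify("Mr Smith (NHS 943 476 5919) was reviewed by Dr Jones."))
# -> [NAME] (NHS [NHS_NUMBER]) was reviewed by [NAME].
```

Replacing identifiers with typed placeholders, rather than deleting them, preserves the sentence structure that downstream NLP models rely on, which is one reason this style of masking is common in research case registers.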