Systematic Review

A Systematic Review of Topic Modeling Techniques for Electronic Health Records

1 Department of Computer and Information Sciences, PIEAS, Lehtrar Road, Nilore, Islamabad 45650, Pakistan
2 Department of Artificial Intelligence, Faculty of Computer Science & Engineering, GIK Institute of Engineering Sciences and Technology, Topi 23640, Pakistan
3 Informatics and Computer Systems Department, King Khalid University, Abha 62521, Saudi Arabia
* Author to whom correspondence should be addressed.
Healthcare 2026, 14(2), 282; https://doi.org/10.3390/healthcare14020282 (registering DOI)
Submission received: 13 December 2025 / Revised: 15 January 2026 / Accepted: 16 January 2026 / Published: 22 January 2026
(This article belongs to the Special Issue AI-Driven Healthcare Insights)

Abstract

Background: Electronic Health Records (EHRs) are a rich source of clinical information used for patient monitoring, disease progression analysis, and treatment outcome assessment. However, their large scale, heterogeneity, and temporal characteristics make them difficult to analyze. Topic modeling has emerged as an effective method to extract latent structures, detect disease characteristics, and trace patient trajectories in EHRs. Recent neural and transformer-based approaches such as BERTopic have significantly improved coherence, scalability, and domain adaptability compared to earlier probabilistic models. Methods: This Systematic Literature Review (SLR) examines topic modeling and its variants applied to EHR data over the past decade. We follow the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to identify, screen, and select relevant studies. The reviewed techniques span traditional probabilistic models, neural embedding-based methods, and temporal extensions designed for pathway and sequence modeling in clinical data. Results: The synthesis covers trends in publication patterns, dataset usage, application domains, and methodological contributions. The reviewed literature demonstrates strengths across different modeling families, while also highlighting challenges related to scalability, interpretability, temporal complexity, and privacy when analyzing large-scale EHRs. Conclusions: Topic modeling continues to play a central role in understanding temporal patterns and latent structures in EHRs. This review also outlines future possibilities for integrating topic modeling with Agentic AI and large language models to enhance clinical decision-making. Overall, this SLR provides researchers and practitioners with a consolidated foundation on temporal topic modeling in EHRs and its potential to advance data-driven healthcare.

1. Introduction

The growing digitization of healthcare has resulted in an unprecedented accumulation of Electronic Health Records (EHRs), which capture multifaceted aspects of patient care such as diagnoses, medications, laboratory tests, and clinical notes [1]. These EHRs have huge potential to fuel data-driven healthcare by permitting insights into disease trajectory, treatment response, and clinical pathways. However, they are complex, high-dimensional, and heterogeneously structured, and they span long time ranges. This makes it extremely difficult for classical statistical methods to extract useful patterns from such data at scale [2,3].
Topic modeling has emerged as a powerful set of methods that can identify hidden structures in EHRs [4,5,6]. First developed in the context of natural language processing, topic models were brought to healthcare to identify hidden disease phenotypes, determine patient subgroups, and map longitudinal clinical trajectories [7]. Over time, research evolved from initial matrix and tensor factorization techniques to probabilistic models, such as Latent Dirichlet Allocation (LDA), that are able to model multi-modal healthcare data [5]. More recently, topic modeling has been extended to larger datasets using neural embedding-based models such as neural variational models and transformer-based approaches [8]. Recently introduced temporal extensions of topic models also allow the detection of changing clinical patterns, and hence are especially useful for pathway mining and longitudinal patient analysis [9]. This progression represents an ongoing effort to increase scalability, interpretability, and practical applicability in complex real-world healthcare environments [10].
An increasingly large body of work has applied these approaches to a variety of healthcare tasks, such as phenotype discovery, comorbidity analysis, patient risk stratification, and clinical decision support. Despite this progress, the literature remains fragmented, with studies differing in methodology, datasets, and application scope. The field faces a trilemma involving scalability, clinical interpretability, and temporal depth. As a result, there is still no agreement on which paradigms are most appropriate for particular data forms or clinical operations. Moreover, as the field moves toward decentralized and privacy-sensitive settings, the trade-offs between model complexity and control over the data remain poorly understood. Thus, a unified SLR is needed to chart the field, distill findings, and underscore gaps that are still open for investigation. This SLR is unique in presenting a multidimensional comparative framework that evaluates topic modeling paradigms against the EHR trilemma, i.e., the inherent trade-off between scalability, clinical interpretability, and temporal depth. In contrast to past surveys, we also analyze which technique is most suitable for each specific dataset type, and our framework provides a comprehensive analysis of the performance of each topic modeling method. In addition, our analysis considers the integration of Agentic AI with large language models (LLMs), focusing on the gap between latent topic discovery and autonomous, clinician-centered decision support.
This SLR seeks to offer such a consolidation through the systematic examination of 79 articles, selected from 557 full-length peer-reviewed articles published between 2015 and 30 September 2025, that use topic modeling for the analysis of EHRs. Through this effort, we present a systematic analysis of methodological advancements, publication trends, dataset usage, and clinical applications. Furthermore, we synthesize strengths, weaknesses, future research directions, and lessons learned across topic modeling paradigms.
The primary contributions of this survey are as follows:
(i) We provide an in-depth taxonomy of topic modeling techniques used in EHRs, covering probabilistic methods, matrix factorization methods, neural methods, transfer learning methods, and temporal extensions.
(ii) We offer an extensive analysis of research findings, covering the datasets used, evaluation measures, topic modeling techniques, strengths, and limitations of these studies.
(iii) We review the existing challenges in applying topic modeling in healthcare, e.g., scalability, interpretability, and data privacy, and present promising avenues for future research, including the integration of Agentic AI and large language models into clinical pathway analysis.

2. Methodology

This SLR systematically reviews the literature on topic modeling techniques applied to EHRs using the PRISMA framework; its protocol has been registered in the Open Science Framework database with the following DOI: 10.17605/OSF.IO/URPYG. We describe the approach used to identify, screen, and evaluate research for this SLR, including how search strings are formed, which keywords are employed, the search approach taken to retrieve the articles, the classification structure chosen, the distribution of papers and datasets, and the criteria used to compare the selected articles. We selected the literature according to the strategy shown in Figure 1, which shows the layered architecture of topic modeling applied to EHRs for pattern discovery and clinical applications. The documents created from the EHRs are first preprocessed and then fed into a topic modeling technique such as Latent Dirichlet Allocation, temporal topic models, or Non-negative Matrix Factorization. The topic modeling technique extracts topics from the documents, which can then be used for downstream applications such as phenotyping and clinical pathway analysis.

2.1. Defining Research Questions

This SLR investigates the following research aspects: (a) What is the taxonomy of topic modeling techniques applied in EHRs? (b) What datasets are available for the research and development of EHR topic modeling techniques? (c) How are diverse topic modeling techniques applied in EHRs? (d) What are the research gaps, challenges, and future research directions in EHR topic modeling?
The objective of this SLR is to address a set of focused research questions that guide our analysis of topic modeling in EHRs:
  • RQ1: What are the predominant topic modeling approaches for EHRs and how can they be compared to each other?
  • RQ2: How have topic modeling methods evolved over time in the context of EHRs?
  • RQ3: What are the strengths and weaknesses of the existing studies in EHR systems?
  • RQ4: What are the challenges and future research directions in the field of topic modeling for EHRs?

2.2. Selecting Databases

Based on the formulated research questions, a search query was run in two bibliographic databases, Scopus and Google Scholar. These were chosen to capture the multidisciplinary intersection of clinical medicine and computer science as widely as possible. Scopus, a formal repository, was used as the main source because of its better indexing of peer-reviewed journals in data mining and medical informatics. Google Scholar was included as a critical secondary source to capture the literature, conference papers, and early-access articles that may not yet be fully indexed in curated repositories. Although specialized databases such as PubMed and IEEE Xplore were searched initially, a sensitivity analysis performed at the preliminary stage revealed that their results were fully covered by Scopus and Google Scholar. Treating these two platforms as a comprehensive union allowed us to achieve high recall and remove redundant duplicates, so that the 79 selected articles cover the range of the current state-of-the-art literature.
Each database returned a different number of articles. The search was performed in September 2025. An objective set of inclusion and exclusion criteria, shown in Table 1, was defined for retrieving articles. These criteria reflect the main objective of this SLR and help us to focus thematically on the most current and relevant articles, in order to see how diverse topic modeling techniques are being used in the healthcare domain with EHRs, which healthcare datasets are available, and which performance metrics are used.

2.3. Formulating Search Terms

To answer the research questions posed in Section 2.1, relevant articles on topic modeling in EHRs were collected using the online databases Scopus (https://www.scopus.com/) and Google Scholar (https://scholar.google.com), as mentioned in Section 2.2. The PRISMA diagram in Figure 2 illustrates the selection and screening process for the articles included in this SLR.
We started with “topic modeling” as the primary keyword. From the initial collection of results, we identified and chose additional keywords to expand the initial set of articles. Examples of these keywords include “topic modeling in EHRs”, “temporal topic modeling”, “clinical pathway analysis using NLP”, “EHR analysis using NLP”, “role of topic modeling in healthcare”, “transformers for topic modeling”, “classical approaches for EHR analysis”, and “Topic modeling in clinical pathway analysis”. Examples of the initial searches performed with these search strings are listed in Table 2.

2.4. Applying Inclusion and Exclusion Criteria and Synthesizing Articles

In the identification phase of PRISMA, 557 articles were initially retrieved from the two databases. AI automation tools were used to remove articles outside our scope, marking 129 articles as ineligible. We then checked titles and abstracts and excluded an additional 302 articles on the basis of the exclusion criteria: publication before 2015 (to avoid outdated knowledge), language other than English, type of study, no use of EHRs, irrelevant content, and inaccessible full texts [11,12,13]. Eight records were found to be duplicates. The remaining 118 records proceeded to the screening phase, where we read each article in full to decide whether it should be included. We further excluded gray literature, which includes technical reports, white papers, and periodical statistical reports, usually published by governmental institutions. Some studies did not use any topic model and contained only theoretical work; we consider such articles outside the scope of this SLR. At the end of the screening phase, as illustrated in Figure 2, 79 of these 118 papers were retained for detailed analysis in this SLR. The first three authors agreed to include only these articles. In addition to relevance, other factors were considered, such as completeness in terms of task definition, description of the proposed model or method, and presentation of results.
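The counts reported above can be cross-checked with simple arithmetic. The short sketch below only restates the figures given in the text; the variable names are ours, and the number excluded at full-text review is derived rather than reported.

```python
# Counts as reported in the text for the PRISMA flow.
retrieved = 557                   # records identified in Scopus and Google Scholar
removed_by_automation = 129       # marked ineligible by automation tools
excluded_on_title_abstract = 302  # excluded after title/abstract check
duplicates = 8                    # duplicate records

# Records that reach full-text screening.
full_text_screened = (
    retrieved - removed_by_automation - excluded_on_title_abstract - duplicates
)

included = 79                     # articles retained for detailed analysis
# Derived value (not stated explicitly in the text).
excluded_at_full_text = full_text_screened - included

print(full_text_screened, excluded_at_full_text)
```

The subtraction reproduces the 118 records reported for the screening phase, which confirms that the stated counts are internally consistent.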

2.5. Classifying Topic Modeling Techniques for EHR

Based on the methodological families of topic modeling techniques applied to EHRs, as seen in Figure 3, we classified the selected techniques into the following five disjoint classes with associated technologies. First, classical models such as clustering and process mining offer straightforward yet powerful ways to organize high-dimensional EHR information. Second, probabilistic topic models include LDA, CTM, STM, and clustering-based derivatives such as K-means; these models provide a clear statistical basis for identifying hidden topics. Third, matrix and tensor factorization approaches extend dimensionality reduction by capturing multi-way linkages across patients, visits, and temporal slices. Fourth, embedding-based models such as ETM and BERT use distributed representations and contextualized embeddings to find semantically rich patterns. Finally, temporal models are particularly suited to the study of patient trajectories and clinical pathways because they explicitly represent topic evolution.

2.6. Publication and Dataset Distribution

To better understand the characteristics of the selected articles, we investigated the distribution of publications across several classifications, such as methods employed, type of publication, venue, temporal patterns, and geographical distribution. These distributions explain the growth and scope of research in topic modeling techniques for EHRs.
Figure 4 highlights the relative share of each methodological family in the literature covered by this SLR. This breakdown is essential to assess which approaches dominate current research and where underexplored opportunities remain. LDA was the leading choice of researchers in this field, with 20 articles, whereas TTM, NTF, and ETM were each applied only twice. Two-thirds of the articles applied LDA, transformer variants, or LSA/CTM; the other articles used clustering, process mining, deep neural networks, or NMF. The strong dominance of LDA-based approaches reflects their maturity, interpretability, and ease of use. In contrast, the sparse use of embedding-based approaches points to their limitations with respect to increased model complexity and data preprocessing challenges.
Similarly, Figure 5 shows the distribution of the articles into different types of publication, e.g., journals, conferences, thesis reports, and book chapters, reflecting the balance between mature contributions and emerging explorations. About seventy percent of the publications were journal articles while a few were theses, and only 2 of the 79 were about topic modeling in EHRs, which shows a considerable research gap for potential researchers in this promising field.
Figure 6 depicts the year-wise publication trend, illustrating the growing momentum of this field in recent years. Figure 7 identifies the most prominent journals and publishers, highlighting where this research is concentrated and which venues are shaping the discourse.
Figure 8 shows the geographic spread of contributions, reflecting global interest and the role of different healthcare systems in driving research priorities.
Table 3 summarizes the three primary dataset categories used across the 79 selected studies. The MIMIC dataset is used in 15 papers, the UK Biobank dataset in 2 papers, and the remaining studies use various proprietary hospital EHRs, which are aggregated here. The table also analyzes the relationship between dataset type and the model family best suited to it. There is an evident match between data modality and methodology: neural models are best applied to the rich text benchmarks in MIMIC, matrix factorization is most efficient on UK Biobank, a dataset with structured and genomic sequences, and temporal innovation is driven by proprietary EHRs because of their longitudinal depth.

2.7. Comparison Criteria

In this SLR, we used four criteria concerning methodological design, evaluation strategies, temporal aspects, and reported limitations to compare and synthesize the studies.
Techniques Used: The topic modeling techniques are classified into five methodological families, i.e., classical models, probabilistic topic models, matrix and tensor factorization models, embedding-based neural models, and temporal extensions. This comparison supports identifying the main techniques in use, revealing methodological diversity, and tracing the evolution of topic modeling from traditional statistical methods to neural and hybrid methods.
Performance Metrics: Different studies use different evaluation metrics. Intrinsic metrics such as perplexity and coherence assess topic quality, while task-based metrics apply in a predictive or classification setting and include measures such as accuracy, F1-score, AUC, and clustering validity indices.
Temporal Dynamics Integration: A major distinguishing feature of the studies is whether they capture temporal information. Capturing it gives a view of the evolving states of a disease, which suits pathway analyses and longitudinal explorations of EHR data.
Limitations and Challenges: Finally, we discuss the constraints admitted in the studied works, including scalability issues with big EHR datasets, challenges in modeling complicated temporal dependencies, privacy concerns about using hospital records, and the restricted interpretive capacity of sophisticated neural techniques. Recognizing these difficulties offers a balanced viewpoint and helps to shape future research directions.
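As an illustration of the intrinsic metrics named under Performance Metrics, perplexity is the exponential of the negative mean per-token log-likelihood on held-out text, so lower values indicate a better fit. The following is a minimal sketch, assuming the per-token log-probabilities have already been obtained from some fitted topic model; the function name and toy inputs are ours.

```python
import math

def perplexity(token_log_probs):
    """Exponential of the negative mean natural-log probability per token."""
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

# Hypothetical held-out tokens: a model that assigns probability 0.25 to
# every token behaves like a uniform choice among four words.
uniform_model = [math.log(0.25)] * 8
# A sharper model that assigns probability 0.5 to every token.
sharper_model = [math.log(0.5)] * 8

print(perplexity(uniform_model), perplexity(sharper_model))
```

The sharper model obtains the lower perplexity, which is the sense in which the reviewed studies use the metric to compare topic counts or model variants.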

3. Topic Modeling Methods for Electronic Health Records

This section reviews the principal methodological families represented in the studies. The taxonomy used to organize these families is shown in Figure 3, and a historical perspective on method development is provided in Figure 9 [13], supported by the distribution of techniques shown in Figure 4.
In the early 2000s, research was primarily driven by classical approaches such as clustering and process mining. These techniques focused on uncovering patterns in textual data. The research direction then shifted towards matrix factorization techniques for topic modeling, such as Non-negative Matrix Factorization (NMF) and Latent Semantic Analysis (LSA), which enabled more structured and interpretable topic representations than the earlier approaches. Over the following decade, Latent Dirichlet Allocation and its extensions, including temporal models, were the primary techniques for topic modeling. With the rise of transformers from 2017 onward, researchers have tried multiple transformer-based approaches to topic modeling to enrich the semantics and context of the extracted topics, reflecting the broader influence of deep learning on the field.

3.1. Classical Approaches

Classical approaches view patient records or document units as feature vectors, and attempt to uncover hidden groups via clustering or to identify procedural processes using process mining techniques [14]. Common instantiations include k-means, hierarchical agglomerative clustering, density-based clustering, e.g., DBSCAN or HDBSCAN, and process mining methods such as alpha miner variations or heuristic miner and their more recent probabilistic or workflow-discovery extensions.
Technically, clustering algorithms work on an explicit representation $X \in \mathbb{R}^{n \times d}$, where each row is a document or patient and the columns are TF–IDF, bag-of-codes, or embedding features. K-means minimizes the within-cluster squared Euclidean distances:
$$\min_{C, \{\mu_k\}} \sum_{i=1}^{n} \left\| x_i - \mu_{C(i)} \right\|^2,$$
where $C(i)$ is the cluster assigned to record $x_i$ and $\mu_k$ is the centroid of cluster $k$. Spectral or hierarchical clustering employs graph-based or linkage-based criteria. Density-based methods search the feature space for areas of high density and can therefore recover non-spherical clusters.
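The k-means objective above can be illustrated with a small, dependency-free sketch of Lloyd's algorithm. This is not taken from any reviewed study: the toy vectors are hypothetical feature values, and the deterministic farthest-point initialization is an assumption made here for reproducibility (random restarts or k-means++ are more common in practice).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=20):
    """Lloyd's algorithm; returns (assignment, centroids, objective)."""
    # Assumption: deterministic farthest-point initialisation.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(far))
    assignment = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: C(i) is the index of the nearest centroid.
        assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: mu_k becomes the mean of the points assigned to k.
        for j in range(k):
            members = [p for p, c in zip(points, assignment) if c == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Within-cluster sum of squared distances (the quantity being minimised).
    objective = sum(squared_distance(p, centroids[c]) for p, c in zip(points, assignment))
    return assignment, centroids, objective

# Hypothetical two-feature patient vectors forming two obvious groups.
X = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [1.0, 0.9], [0.9, 1.0], [0.95, 0.95]]
labels, centers, wcss = kmeans(X, k=2)
print(labels, wcss)
```

In an EHR setting the rows of `X` would be TF–IDF, bag-of-codes, or embedding vectors, and the choice of K and distance metric would need clinical validation.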
Process mining models work on event logs $L = \{(p_i, e_{i,1}, \ldots, e_{i,T_i})\}$, where each trace is a time-ordered sequence of coded events for patient $p_i$; the goal is to deduce a process model, e.g., a Petri net, transition graph, or probabilistic automaton, that accounts for the observed behavior [15]. Noise filtering, frequency thresholds, and role or activity abstraction are frequently used in workflow discovery.
In EHR applications, these techniques are usually applied after feature extraction: clinical notes are vectorized, e.g., with TF–IDF or embeddings, and structured events are encoded as categorical or temporal features. Clustering groups comparable topic mixtures, and process mining produces explicit pathways and transition probabilities for sequential studies, hence enhancing topic-based representations by offering cohort identification. Important hyperparameters include the number of clusters (K), the distance metric, and density thresholds for clustering, and minimum support and abstraction granularity for process mining.
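A simple way to see what workflow discovery extracts from such an event log is a directly-follows graph with a frequency threshold. The sketch below is illustrative only: the event names, the helper names, and the `min_support` parameter are assumptions, and real process mining algorithms such as the heuristic miner add dependency measures beyond raw counts.

```python
from collections import Counter, defaultdict

def directly_follows(traces):
    """Count how often event a is immediately followed by event b."""
    counts = Counter()
    for events in traces.values():
        for a, b in zip(events, events[1:]):
            counts[(a, b)] += 1
    return counts

def transition_probabilities(counts, min_support=1):
    """Drop rare edges, then normalise the kept counts per source event."""
    kept = {pair: n for pair, n in counts.items() if n >= min_support}
    totals = defaultdict(int)
    for (a, _), n in kept.items():
        totals[a] += n
    return {pair: n / totals[pair[0]] for pair, n in kept.items()}

# Hypothetical event log: one time-ordered trace per patient.
event_log = {
    "p1": ["admission", "lab_test", "medication", "discharge"],
    "p2": ["admission", "lab_test", "surgery", "discharge"],
    "p3": ["admission", "medication", "discharge"],
}
dfg = directly_follows(event_log)
probs = transition_probabilities(dfg)
print(probs[("admission", "lab_test")])
```

Raising `min_support` plays the role of the noise filtering mentioned above: infrequent transitions are removed before the pathway is interpreted.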

3.2. Probabilistic Topic Modeling Approaches

Probabilistic topic models represent documents or patient records as mixtures over latent topics and represent word generation via topic-specific word distributions. The archetypal model, Latent Dirichlet Allocation (LDA), places Dirichlet priors $\theta_d \sim \mathrm{Dir}(\alpha)$ on document–topic proportions and $\phi_k \sim \mathrm{Dir}(\beta)$ on topic–word distributions; each word is then generated by first sampling a topic $z_{d,n} \sim \mathrm{Categorical}(\theta_d)$ and then a word $w_{d,n} \sim \mathrm{Categorical}(\phi_{z_{d,n}})$. For large corpora, inference is usually carried out using collapsed Gibbs sampling, variational Bayes, or stochastic variational inference.
Correlated Topic Models (CTMs) capture topic correlations by replacing independent Dirichlet priors with logistic normal priors, and Structural Topic Models (STMs) use regression-style components to add document-level covariates into topic prevalence and topical content. Supervised versions such as sLDA or guided/seeded LDA introduce outcomes through generalized linear models, or supervise by labeling topic proportions [13].
EHR-specific instantiations usually take as input either clinical notes (individually or concatenated per patient) or code sequences, with documents built from visit-level codes. Important modeling decisions include the unit of analysis, i.e., note-level, visit-level, or patient-level; vocabulary creation, i.e., words, clinical concepts, or ICD/PheCodes; and the priors used for sparsity and regularization. Essential hyperparameters include the topic count K, the Dirichlet concentration parameters $\alpha$ and $\beta$, and, for STM or CTM, the form of the covariates and their link functions. Evaluation employs intrinsic measures such as topic coherence and perplexity and, when topics are used as features, downstream extrinsic tasks, i.e., classification, clustering, and survival modeling.
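The generative process just described can be simulated directly. The sketch below samples a tiny synthetic corpus from the LDA model using only the standard library; the vocabulary, hyperparameter values, and function names are illustrative assumptions, and the code performs sampling only, not inference.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dir(alpha) by normalising independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw an index with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_corpus(n_docs, doc_len, n_topics, vocab, alpha=0.5, beta=0.5, seed=0):
    rng = random.Random(seed)
    # phi_k ~ Dir(beta): one word distribution per topic.
    phi = [sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)]
    docs, thetas = [], []
    for _ in range(n_docs):
        # theta_d ~ Dir(alpha): topic proportions for this document.
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            z = sample_categorical(theta, rng)   # z_{d,n} ~ Categorical(theta_d)
            w = sample_categorical(phi[z], rng)  # w_{d,n} ~ Categorical(phi_z)
            words.append(vocab[w])
        docs.append(words)
        thetas.append(theta)
    return docs, thetas, phi

# Hypothetical clinical vocabulary.
vocab = ["insulin", "glucose", "metformin", "stent", "angina", "statin"]
docs, thetas, phi = generate_corpus(n_docs=3, doc_len=8, n_topics=2, vocab=vocab)
print(docs[0])
```

Inference algorithms such as collapsed Gibbs sampling or variational Bayes run this process in reverse, recovering plausible values of `thetas` and `phi` from observed documents.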
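The effect of the unit of analysis can be sketched with a small bag-of-codes example. The record layout, identifiers, and helper name below are hypothetical; the diagnosis codes are used purely as illustrative tokens.

```python
from collections import Counter, defaultdict

# Hypothetical structured records: (patient_id, visit_id, diagnosis_code).
records = [
    ("p1", "v1", "E11"), ("p1", "v1", "I10"), ("p1", "v2", "E11"),
    ("p2", "v1", "I50"),
]

def build_documents(records, unit="patient"):
    """Aggregate codes into bag-of-codes documents at the chosen unit."""
    docs = defaultdict(Counter)
    for patient, visit, code in records:
        key = patient if unit == "patient" else (patient, visit)
        docs[key][code] += 1
    return dict(docs)

patient_docs = build_documents(records, unit="patient")
visit_docs = build_documents(records, unit="visit")
print(len(patient_docs), len(visit_docs))
```

Patient-level documents are longer and fewer, which tends to help topic estimation on sparse codes, while visit-level documents preserve the ordering needed by temporal models.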

3.3. Matrix and Tensor Factorization Approaches

Matrix factorization methods, e.g., Non-negative Matrix Factorization (NMF), and tensor decompositions generalize topic discovery to linear-algebraic latent factorization. NMF seeks non-negative matrices $W \in \mathbb{R}_{\ge 0}^{n \times r}$ and $H \in \mathbb{R}_{\ge 0}^{r \times d}$ such that $X \approx WH$, where $X$ is a term–document or event-count matrix. Optimization is typically performed by alternating multiplicative updates or projected gradient methods minimizing an objective such as KL divergence or Frobenius norm.
Tensor methods, such as CP, PARAFAC, PARAFAC2, and Non-negative Tensor Factorization, extend factorization to multi-way arrays $\mathcal{X} \in \mathbb{R}^{I \times J \times K}$ and model $\mathcal{X}$ as a sum of rank-1 components, $\mathcal{X} \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r$ [16]. PARAFAC2 and related variants accommodate irregular mode sizes (e.g., varying visit counts per patient) and are therefore well suited for EHRs that exhibit irregular longitudinal structures. Optimization is commonly handled via alternating least-squares with non-negativity and sparsity constraints when required.
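The alternating multiplicative updates mentioned above can be sketched for the Frobenius objective as follows. This is a minimal pure-Python illustration using the standard multiplicative update rules; the toy count matrix, the helper names, and the small `eps` added to the denominators to avoid division by zero are assumptions.

```python
import random

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(X, W, H):
    """Frobenius norm of X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry)) ** 0.5

def nmf(X, r, n_iter=200, eps=1e-9, seed=0):
    """Factorise X (n x d) into non-negative W (n x r) and H (r x d)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(d)] for _ in range(r)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

# Hypothetical patient-by-code count matrix with two blocks of co-occurring codes.
X = [[2, 1, 0, 0], [4, 2, 0, 0], [0, 0, 1, 3], [0, 0, 2, 6]]
W, H = nmf(X, r=2)
print(frobenius_error(X, W, H))
```

Each row of `H` plays the role of a topic's code loadings and each row of `W` gives a patient's affiliation with the factors; because the updates only multiply by non-negative ratios, both matrices stay non-negative throughout.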
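To make the rank-1 sum concrete, the sketch below rebuilds a small patient × code × time tensor from given CP factor matrices. The factor values are hypothetical and chosen by hand; fitting them (for example by alternating least squares) is not shown.

```python
def cp_reconstruct(A, B, C):
    """Return X with X[i][j][k] = sum_r A[i][r] * B[j][r] * C[k][r]."""
    R = len(A[0])
    return [
        [
            [sum(A[i][r] * B[j][r] * C[k][r] for r in range(R)) for k in range(len(C))]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]

# Hypothetical factors with R = 2 components.
A = [[1, 0], [0, 2]]          # 2 patients: loading on each component
B = [[1, 0], [1, 0], [0, 1]]  # 3 codes: codes 0 and 1 belong to component 0
C = [[1, 1], [0, 1]]          # 2 time slices: component 0 is active only at t = 0
X_hat = cp_reconstruct(A, B, C)
print(X_hat[1][2])
```

Reading the factors column by column gives the topic-like interpretation: component 0 groups codes 0 and 1 for patient 0 at the first time slice, while component 1 links code 2 to patient 1 across both slices.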
In practice, factorization approaches are used when the multi-aspect structure is explicit: patient × code × time tensors capture how latent factors (akin to topics) evolve across visits or cohorts. Factor matrices provide interpretable loadings: a factor’s code-loading vector resembles a topic’s word distribution, while the patient-loading vector indicates patient affiliation. Hyperparameters include rank r, sparsity regularization coefficients, and temporal smoothing penalties. Factorization methods are highly effective at capturing multi-way interactions, but they require careful initialization and regularization to avoid degeneracy and overfitting.

3.4. Embedding-Based and Neural Topic Models

Embedding-based approaches replace sparse bag-of-words representations with dense continuous vectors, either pretrained static embeddings, e.g., word2vec and GloVe, or contextualized transformer embeddings such as BERT-family models. Neural topic models such as the Embedded Topic Model (ETM) integrate word embeddings into the topic–word parameterization, often through neural networks that map embedding spaces to topic distributions. Contemporary pipelines, e.g., BERTopic, take sentence or document embeddings from transformers and then apply density-based clustering, e.g., HDBSCAN, and class-based TF–IDF to produce human-interpretable topic labels [17].
From a technical perspective, ETM parameterizes topic–word distributions as functions of word embeddings $v_w$ and topic embeddings $t_k$, where $\phi_{k,w} \propto \exp(t_k^\top v_w)$; inference may proceed with amortized variational inference using neural encoders [18]. Transformer-based workflows embed each document $d$ into a vector $u_d$ and then apply clustering or mixture models to discover groups in the embedding space; topics are generated by extracting representative words via class-based TF–IDF or by fitting lightweight probabilistic components on top of the embeddings [19,20].
In EHRs, embedding-based methods improve the semantic capture of clinical language by capturing abbreviations, handling negation, and coping robustly with short or fragmented notes [21]. Implementation considerations include the choice of embedding backbone, e.g., general BERT vs. ClinicalBERT, dimension reduction strategies for clustering, and techniques for producing human-interpretable topic labels from embeddings. Computational cost is higher than for classical models, and reproducibility depends on embedding model versions and pretraining corpora.
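The ETM parameterization amounts to a softmax over inner products between a topic embedding and every word embedding. A minimal sketch follows, assuming hand-picked two-dimensional embeddings; the words, vectors, and function name are hypothetical, and in ETM these embeddings are learned or pretrained rather than fixed by hand.

```python
import math

def topic_word_distribution(topic_vec, word_vecs):
    """phi_{k,w} proportional to exp(t_k . v_w), normalised over the vocabulary."""
    scores = [sum(t * v for t, v in zip(topic_vec, vec)) for vec in word_vecs.values()]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return {word: e / norm for word, e in zip(word_vecs, exps)}

# Hypothetical word embeddings: the first axis loosely encodes "diabetes",
# the second "cardiology".
word_vecs = {
    "insulin": [1.0, 0.0],
    "glucose": [0.9, 0.1],
    "stent": [0.0, 1.0],
}
diabetes_topic = [2.0, 0.0]  # hypothetical topic embedding t_k
phi_k = topic_word_distribution(diabetes_topic, word_vecs)
print(max(phi_k, key=phi_k.get))
```

Because words with similar embeddings receive similar scores, a rare abbreviation embedded near "insulin" would inherit a high probability under this topic even if it seldom appears in the corpus.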

3.5. Temporal Topic Modeling Approaches

Temporal models explicitly describe how latent topics or states change over time [22]. Approaches in this category include Dynamic Topic Models (DTMs), topics-over-time, state-space extensions of topic models, Hidden Markov Models (HMMs) with topic emissions [23], and sequence models that combine topic inference with RNNs or transformer-based temporal encoders. The mathematical formulations differ: DTMs apply temporal dynamics to the topic parameters $\phi_{k,t}$ by means of state evolution equations, e.g., Gaussian random walks in the natural-parameter space, whereas HMM-based variants assume a latent discrete state sequence $s_{p,t}$ for patient $p$, with transition matrix $A$ and state-conditional emission models that may themselves be topic distributions.
For EHR data, temporal modeling has to deal with irregular sampling, variable-length patient histories, and censoring. Practical remedies include time-binning (fixed-width windows), irregular-time models such as continuous-time variants, and handling missingness via data imputation or model augmentation. Temporal models are mostly used for pathway mining, clinical trajectory exploration, and modeling the onset and progression of illnesses. Key implementation decisions include the temporal granularity, the state-space dimension for HMMs, the smoothing strength for DTMs, and whether topic parameters evolve smoothly or may show sudden changes such as topic birth or topic death. Inference can be computationally demanding, e.g., SVI for DTMs and forward–backward recursions for HMMs; hence, reliable initialization is imperative to avoid poor local optima.
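For the HMM-based variants, the likelihood of a patient's observation sequence is obtained with the forward recursion over the latent states. The sketch below uses a toy model with two hidden states and two observable codes; the parameter values and the interpretation of the states are assumptions, and in a topic-emission HMM each row of `B` would be a topic's distribution over codes or words.

```python
def forward_likelihood(obs, pi, A, B):
    """P(obs) under an HMM with initial probs pi, transitions A, emissions B."""
    n_states = len(pi)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * A[r][s] for r in range(n_states)) * B[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)

# Hypothetical two-state model ("stable" = 0, "deteriorating" = 1).
pi = [0.6, 0.4]                # initial state probabilities
A = [[0.7, 0.3], [0.2, 0.8]]   # A[r][s] = P(next state s | current state r)
B = [[0.9, 0.1], [0.2, 0.8]]   # B[s][o] = P(observed code o | state s)

visits = [0, 0, 1]             # observed code index at each visit
print(forward_likelihood(visits, pi, A, B))
```

The same recursion, combined with a backward pass, yields the state posteriors used to fit `A` and `B`, which is the forward–backward computation whose cost grows with the state-space dimension.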
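Time-binning, the simplest of the remedies listed above, can be sketched as follows. The event list, the 30-day window, and the function name are hypothetical; days are counted from each patient's first recorded event.

```python
from collections import defaultdict

def bin_events(events, window_days):
    """Group (day, code) events into fixed-width windows from the first event."""
    if not events:
        return {}
    start = min(day for day, _ in events)
    bins = defaultdict(list)
    for day, code in sorted(events):
        bins[(day - start) // window_days].append(code)
    return dict(bins)

# Hypothetical irregularly sampled history for one patient: (day, code).
history = [(0, "E11"), (3, "I10"), (35, "E11"), (36, "N18"), (95, "I50")]
windows = bin_events(history, window_days=30)
print(windows)
```

Windows with no events are simply absent from the result, so a downstream temporal model must decide whether to treat such gaps as missing data or as a period without clinical contact.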

4. Findings for the Topic Modeling Methods in EHRs

In this section, we present the comparative findings from our study of the methodologies described above. Table 4 defines the legend used across the comparative analysis tables, reflecting the claims made in each reviewed study.

4.1. Classical Techniques

Studies that use any of the classical techniques of clustering, classification, or process/sequence mining are classified in this category. Studies of classical techniques [2,8,24,25,26,27,28] highlight their continuing relevance in identifying clinical subgroups and treatment pathways. As summarized in Table 5, consensus clustering and patient-similarity methods showed an ongoing ability to stratify varied groups, including sepsis or chronic disease cohorts. Similarly, process mining systems such as MEDCP and fuzzy process mining detected recognizable treatment flows and captured workflow variety across many hospitals. In particular, these approaches earned high scores on interpretability, often validated by clinical experts, which makes them attractive in situations where transparency is crucial.
The tables also emphasize their limitations [29]: heavy dependence on coding quality, site-specific preprocessing, and limited ability to generalize across institutions. Even when combined with transformers, e.g., TransformEHR, conventional methods had higher computation costs and required expert-intensive configuration. The findings show that although clustering and process mining are still relied upon for pathway mining and verification, they are increasingly being used together with modern representation learning. Such combinations are expected to help these methods move beyond their predictive constraints and scalability problems.

4.2. Probabilistic Topic Modeling Techniques

Table 6 compares the probabilistic topic modeling techniques. Studies are classified in this category if they use any probabilistic technique, such as LDA or any of its non-temporal variants (CTM, ETM, BTM, etc.). The table highlights their role as the foundational techniques for EHR analysis and their continued adaptability to various clinical tasks [3,6,9,10,17,21,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53]. Particularly for the exploratory analysis of clinical notes, coding data, and multi-modal EHRs, LDA and its Bayesian extensions remain the most widely used methods. These studies consistently demonstrated interpretability and helped to identify disease clusters, care themes, and phenotypic subtypes across settings including MIMIC-III, dementia cohorts, and institutional cardiology databases. More recent models, including MixEHR, MixEHR-S, MixEHR-Nest, and MixEHR-Guided, extended the probabilistic approach to encompass multi-modal data, supervised outcomes, and nested phenotypes. Likewise, specialized solutions such as the Graph-Embedded Topic Model (G-ETM) and sequence-based probabilistic models demonstrated the adaptability of Bayesian generative structures in integrating relational knowledge or sequential EHR dynamics. Overall, these techniques were useful for phenotype detection, risk classification, outbreak detection, and supporting downstream prediction models.
At the same time, the results show persistent difficulties. Classical LDA approaches usually relied on qualitative expert labeling and proved sensitive to preprocessing and topic-count decisions, which restricts reproducibility. Although supervised or guided versions improved alignment with clinical outcomes, they introduced complexity, reliance on curated priors, and high computational costs. Probabilistic models also struggled with short-text collections, cross-institution generalizability, and the handling of temporal dependencies without explicit extensions. Even hybrids incorporating optimization techniques, e.g., Bayesian hyperparameter tuning, clustering integrations, and ChatGPT-3.5-assisted interpretation, underlined both the continuing relevance and the inherent limits of probabilistic models. Overall, these findings indicate that probabilistic models still anchor the field owing to their interpretability and theoretical basis [13], yet their long-term value increasingly depends on deliberate extension with multi-modal, relational, and temporal elements to meet the requirements of modern EHR analysis.
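To make the sensitivity to topic count and random initialization concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written with the Python standard library only. It is not the implementation used by any reviewed study; the toy vocabulary, the six "notes", and the hyperparameter defaults (alpha, beta, iters) are illustrative assumptions.

```python
import random

def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA; docs are lists of integer word ids."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # n_dk counts
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n_kw counts
    topic_total = [0] * num_topics                               # n_k counts
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token (unnormalized).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    # Smoothed point estimates of document-topic and topic-word distributions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + vocab_size * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return theta, phi

# Hypothetical toy corpus: tokens are illustrative, not drawn from any real EHR.
vocab = ["chest", "pain", "troponin", "insulin", "glucose", "hba1c"]
notes = [
    ["chest", "pain", "troponin", "chest"],
    ["troponin", "pain", "chest"],
    ["insulin", "glucose", "hba1c"],
    ["glucose", "hba1c", "insulin", "glucose"],
    ["chest", "pain", "glucose"],
    ["hba1c", "insulin", "pain"],
]
index = {word: i for i, word in enumerate(vocab)}
docs = [[index[t] for t in note] for note in notes]

theta, phi = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab), seed=0)
for k, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda w: -row[w])[:3]
    print(f"topic {k}:", [vocab[w] for w in top])
```

Re-running the sketch with a different `seed` or `num_topics` can change which words group together, which is one reason the reviewed studies report reproducibility concerns when these choices are not documented.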

4.3. Matrix and Tensor Factorization Techniques

Studies are classified in this category if they use any factorization method such as NTF, NMF, LSI, LSA, or any of their variants. As the comparison in Table 7 shows, NMF and tensor-based models [23,29,54,55,56,57,58,59] have been particularly successful in identifying hidden phenotypes and multimorbidity patterns from structured EHRs [60]. For example, PARAFAC2-based methods successfully combined static and temporal features, while constrained tensor factorization captured evolving cardiovascular disease phenotypes. These approaches consistently produced interpretable factor loadings, enabling the detection of important subgroups and temporal patterns. The evaluations often reduced multiple-testing risks and improved clinical coherence, indicating considerable value for hypothesis generation.
The review also identified several limitations associated with factorization methods. Their performance is strongly influenced by rank selection, sparsity, and preprocessing, and they often struggle to integrate text data or irregular event sequences. Another frequently cited limitation was computational cost, especially for large temporal tensors. Matrix and tensor methods remain strong for structured, multi-aspect data, even though they need careful parameter tuning and usually benefit from hybridization with other methods to improve scalability and generalization.
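As an illustration of how factor loadings arise and why rank selection matters, the sketch below factorizes a small patient-by-code count matrix with NMF using the standard multiplicative update rules for the squared (Frobenius) reconstruction error. The matrix values, the rank, and the helper names are hypothetical, and the reviewed studies may use different objectives, constraints, or tensor extensions.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=300, seed=0, eps=1e-9):
    """Non-negative factorization V ~ W H via multiplicative updates."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [frob_error(v, w, h)]
    for _ in range(iters):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)] for k in range(rank)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)] for i in range(n)]
        errors.append(frob_error(v, w, h))
    return w, h, errors

# Hypothetical counts: rows are patients, columns are four diagnosis-code groups.
visits = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
]
w, h, errors = nmf(visits, rank=2)
print("initial error:", round(errors[0], 3), "final error:", round(errors[-1], 3))
print("code loadings per factor:", [[round(x, 2) for x in row] for row in h])
```

The rows of `h` play the role of candidate phenotypes (loadings over codes) and the rows of `w` give each patient's membership. Changing `rank` changes how the codes are grouped, which is the rank-selection sensitivity noted above.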

4.4. Embedding-Based and Neural Topic Modeling Techniques

Any study that used embeddings, mainly transformer-based or other neural approaches, has been classified in this category. The synthesis in Table 8 reveals the rapid post-2020 development of embedding-based and neural methods [7,13,16,20,22,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78]; recent research shows improved performance of transformer-based and deep representational models in specific comparative studies [19]. Techniques such as Med-BERT, BEHRT, and ExBEHRT showed clear improvements in predictive accuracy and phenotyping quality, especially in multi-modal and longitudinal EHRs. They complement the early attempts at generative models, as well as the hybrid approaches that merged EMRs2CSP, GPT-style Foresight, and LLM-assisted clinical pathway modeling.
However, the tables also show major challenges: high computational requirements, limited interpretability, and data access problems usually limit reproducibility across sites. Many transformer-based models also need large amounts of pretraining data, which limits their value in resource-constrained settings. Still, these methods represent the cutting edge, as there is strong evidence of growing research interest in embedding-rich and sequence-aware methods for temporally driven EHR modeling.

4.5. Temporal Models

Studies that use any temporal topic modeling technique, such as TTM, HMM, etc., are classified in this category. The temporal modeling methods [5,79,80,81,82,83,84] compared in Table 9 underline their relevance for studying patient trajectories, disease progression, and pathway development in EHRs. Latent-state models and HMM-based methods showed strong probabilistic generative capabilities for handling structured sequences, modeling comorbidity evolution, and producing interpretable latent subtypes [5]. HMMs applied to co-occurrence patterns captured temporal structure while remaining computationally feasible, and extensions such as latent treatment topic models supported customized pathways, forecasting, and next-step clinical decision support. By explicitly tracking changes in topic prevalence over time, temporal topic models such as DTM and TTM provided new insights by revealing emerging risk factors and therapy flows. For instance, applications to GP notes found signals preceding fall incidents through changing topic trajectories, and national-scale databases such as N3C made it possible to uncover the temporal evolution of long-COVID traits across millions of patients.
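The computational feasibility of HMM-based methods comes largely from the forward recursion, which scores an event sequence in time linear in its length rather than enumerating every latent path. The sketch below shows this for a two-state discrete HMM and checks it against brute-force enumeration. The state names, event coding, and all probabilities are hypothetical values chosen for illustration, not parameters from any reviewed study.

```python
from itertools import product

def forward_likelihood(obs, start, trans, emit):
    """P(obs) under a discrete HMM, computed with the forward recursion."""
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            emit[s][o] * sum(alpha[p] * trans[p][s] for p in range(n_states))
            for s in range(n_states)
        ]
    return sum(alpha)

def brute_force_likelihood(obs, start, trans, emit):
    """Same quantity by summing over every latent state path (exponential cost)."""
    total = 0.0
    for path in product(range(len(start)), repeat=len(obs)):
        p = start[path[0]] * emit[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= trans[path[t - 1]][path[t]] * emit[path[t]][obs[t]]
        total += p
    return total

# Hypothetical model: latent states 0 = "stable", 1 = "deteriorating";
# observed events 0 = routine visit, 1 = emergency visit, 2 = ICU admission.
start = [0.8, 0.2]
trans = [[0.9, 0.1],
         [0.3, 0.7]]
emit = [[0.70, 0.25, 0.05],
        [0.20, 0.50, 0.30]]

trajectory = [0, 1, 2, 1]
print("forward:    ", forward_likelihood(trajectory, start, trans, emit))
print("brute force:", brute_force_likelihood(trajectory, start, trans, emit))
```

Both functions compute the same likelihood; only the forward version remains practical for long patient trajectories. The recursion depends on the Markov assumption that the next state depends only on the current one.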
At the same time, the limitations of these techniques are both persistent and substantial. Many models depend on strong simplifying assumptions, for instance, Markov independence or discrete latent states, which may oversimplify the complex temporal dependencies inherent in longitudinal EHR data. Results were highly sensitive to preprocessing choices, including window size and temporal aggregation. Missing data and irregular sampling presented additional difficulties, and computational demands rose significantly for temporal extensions compared with static models. Although temporal methods are essential for pathway mining and disease progression modeling, these results show that their reliability depends heavily on data quality and preprocessing rigor. Hybridizing temporal topic models with continuous-time representations or flexible neural sequence encoders could enable more accurate modeling of patient journeys and thereby help overcome these limitations.
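The effect of window size can be seen even in a simple post-hoc analysis that averages per-note topic proportions within fixed time windows. The sketch below is an illustrative assumption rather than a method from the reviewed studies: the record format, the day offsets, and the two-topic proportions are all hypothetical.

```python
from collections import defaultdict

def topic_prevalence(records, window_days):
    """Mean topic proportions per time window.

    records: iterable of (day_offset, [topic proportions]) pairs.
    Returns {window_index: [mean proportion per topic]}.
    """
    sums = {}
    counts = defaultdict(int)
    for day, theta in records:
        w = day // window_days  # index of the window this note falls into
        if w not in sums:
            sums[w] = [0.0] * len(theta)
        sums[w] = [a + b for a, b in zip(sums[w], theta)]
        counts[w] += 1
    return {w: [s / counts[w] for s in sums[w]] for w in sorted(sums)}

# Hypothetical notes for one patient: (days since first visit, [topic A, topic B]).
records = [
    (0, [0.9, 0.1]),
    (10, [0.7, 0.3]),
    (40, [0.5, 0.5]),
    (70, [0.2, 0.8]),
    (100, [0.1, 0.9]),
]

print("30-day windows:", topic_prevalence(records, 30))
print("90-day windows:", topic_prevalence(records, 90))
```

With 30-day windows the shift from topic A to topic B appears gradually across four windows; with 90-day windows the same notes collapse into two windows and the transition looks abrupt. Irregular sampling also leaves some windows supported by a single note.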

4.6. Hybrid Models

Techniques that combine any of the abovementioned approaches, such as probabilistic and embedding-based methods or classical process mining with transformers, are classified as hybrid. The comparison of hybrid approaches [4,18,85,86] in Table 10 reveals the growing tendency to combine traditional topic modeling with supporting machine learning techniques in an effort to balance interpretability and predictive accuracy. Methods such as FKLSA (Fuzzy K-Means + LSA + PCA) improved topic accuracy and stability on noisy corpora by means of clustering and dimensionality reduction, showing benefits over conventional baselines such as plain LDA or LSA [38]. A recent study integrating LDA with BiLSTM architectures demonstrated the benefits of combining interpretable latent themes with temporal sequence modeling. The improvements these hybrid systems achieved on predictive indicators such as AUC, accuracy, and cost/time efficiency showed that fusion models can contribute directly to optimizing healthcare systems.
The results also highlight creative extensions such as KG-TM, which pairs knowledge graph embeddings with probabilistic topic modeling. By anchoring latent themes to curated medical ontologies, this approach enhanced topic coherence and enabled ontology-guided phenotyping [36], producing more clinically relevant latent themes. In these hybrid methods, additional modeling layers are intended to increase the temporal, semantic, or relational capacity of topic models while maintaining their human interpretability. Several limitations remain, however: noisy topic inputs may propagate errors into the hybrid system; knowledge graph models need considerable curation and mapping work; and tuning the interaction between components can be difficult. Although cross-institution generalizability and robustness still need improvement, these results suggest that hybrid models offer a good middle ground.

5. Challenges and Future Directions

Even though topic modeling has proven to be an effective approach for extracting insights from EHRs, many cross-cutting problems still obstruct its full use in clinical practice [13]. Each of these challenges offers opportunities for further research: scalability, interpretability, temporal modeling, integration with new artificial intelligence techniques, and privacy. Table 11 provides a systematic comparison of how these problems affect each of the topic modeling techniques discussed in this review.
Several probabilistic and neural topic modeling methods investigated in this SLR show promising performance on standard datasets but struggle to scale in real-world settings, where EHR databases contain millions of entries. At the same time, the interpretability of latent topics remains a common concern. Even accurate models built with these techniques cannot be integrated into decision support without clinically meaningful representations. To ensure clinical utility, future studies must therefore focus on developing scalable architectures that preserve interpretability by combining human-in-the-loop validation with efficient variational inference [20].
Many temporal extensions to topic modeling have enabled researchers to identify changing disease patterns and patient journeys. Still, modeling the full temporal richness of EHRs remains difficult. The reviewed works imply that a better understanding of complex clinical pathways demands hybrid methods combining temporal topic models with sequential learning algorithms such as recurrent or attention-based systems [8].
None of the examined studies explicitly links temporal topic modeling with agentic systems. Future research could therefore examine how LLMs could be applied to topic modeling results to produce natural language interpretations. Recent advances in LLMs and the development of Agentic AI offer new possibilities to improve traditional topic modeling approaches [87,88,89] by extending their outputs beyond static analysis toward dynamic interpretation, interaction, and decision support. While current studies primarily employ topic modeling to identify latent themes, LLMs could be a powerful tool for contextualizing, summarizing, and translating these topics into clinically meaningful narratives, which in turn could significantly improve interpretability and usability for healthcare professionals. There is a clear path toward the development of interactive, flexible, and understandable clinical agents that could dynamically manage topic modeling workflows, validate topic evolution over time, cross-check findings against clinical guidelines, and present insights in clinician-friendly formats such as natural language summaries of disease trajectories or treatment pathways. Furthermore, agentic systems could proactively monitor incoming clinical data streams, trigger re-analysis when topic shifts are detected, and support clinicians through early warnings and evidence-based suggestions. While these capabilities have not yet been realized in practice, they represent a promising avenue for advancing explainable, scalable, and clinician-centered decision support systems in healthcare.
Privacy is a common challenge with patient data. Because EHR data are highly sensitive, several of the articles examined rely on a few publicly available datasets such as MIMIC-III [36]. This restricts the range of evaluations and compromises the generalizability of the findings. Federated learning and privacy-preserving technologies provide promising ways to scale topic modeling across organizations without exposing raw patient information [83]. Wide clinical adoption will depend on addressing fairness, data governance, and reproducibility in federated topic modeling.

6. Conclusions

Although EHRs hold great promise for data-driven healthcare, their complexity, scale, and temporal depth make analysis challenging. Topic modeling has emerged as an efficient means to find hidden patterns, group patients, and trace trajectories. Using the PRISMA framework, this SLR examined 79 publications published between 2015 and 2025. We surveyed publication trends, dataset use, and application fields by means of a taxonomy covering probabilistic, matrix factorization, neural, temporal, and hybrid approaches. Our synthesis draws attention to ongoing hurdles while also highlighting methodological advancements in data-driven healthcare analysis.
The most notable finding of the analysis is the unresolved trade-off between interpretability and scalability. Although contemporary models capture temporal depth, clinician trust is hampered by their black-box character. As of now, no single strategy meets every need for real-time deployment. There is an urgent need for the field to shift from pure optimization toward clinician-centered agentic systems. To ensure that models are built from the outset for human-in-the-loop validation, the focus should be on developing hybrid frameworks in which LLMs convert complex topics into practical narratives. Looking ahead, significant opportunities lie in creating privacy-preserving, transparent, efficient, and clinically grounded models and in connecting topic modeling with Agentic AI and LLMs to produce more engaging and useful clinical tools. This SLR offers practitioners and researchers a consolidated reference and a road map for advancing temporal topic modeling in healthcare analysis.

Funding

The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through Large Research Project under grant number RGP2/283/46.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Singh, A. Agentic AI in Healthcare: Diagnosis and Treatment. 2025. Available online: https://ssrn.com/abstract=5214492 (accessed on 17 September 2025).
  2. Chen, J.; Sun, L.; Guo, C.; Wei, W.; Xie, Y. A data-driven framework of typical treatment process extraction and evaluation. J. Biomed. Inform. 2018, 83, 178–195.
  3. Cao, T.; Zhao, W.; Wu, H.; Giordano, T.; Karris, M.; Napravnik, S.; Whisenant, M.; Brady, V.; Burkholder, G.; Christopoulos, K.; et al. A New Approach to Discovering HIV Symptom and Patient Clusters Using CNICS Data and Topic Modeling. In Proceedings of the 2024 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI), Houston, TX, USA, 10–13 November 2024; pp. 1–8.
  4. Rashid, J.; Shah, S.M.A.; Irtaza, A. A novel fuzzy k-means latent semantic analysis (FKLSA) approach for topic modeling over medical and health text corpora. J. Intell. Fuzzy Syst. 2019, 37, 6573–6588.
  5. Zaballa, O.; Pérez, A.; Gómez-Inhiesto, E.; Acaiturri-Ayesta, T.; Lozano, J.A. A probabilistic generative model to discover the treatments of coexisting diseases with missing data. Comput. Methods Programs Biomed. 2024, 243, 107870.
  6. Huang, Z.; Dong, W.; Duan, H. A probabilistic topic model for clinical risk stratification from electronic health records. J. Biomed. Inform. 2015, 58, 28–36.
  7. Ma, L.; Chen, R.; Ge, W.; Rogers, P.; Lyn-Cook, B.; Hong, H.; Tong, W.; Wu, N.; Zou, W. AI-powered topic modeling: Comparing LDA and BERTopic in analyzing opioid-related cardiovascular risks in women. Exp. Biol. Med. 2025, 250, 10389.
  8. Askeli, S. Diagnostic Machine Learning Utilizing Text Mining and Supervised Classification in Inborn Errors of Immunity. Master’s Thesis, Perustieteiden Korkeakoulu, Otaniemi, Finland, 2024.
  9. Li, M.; Lee, K.; Liu, Z.; Ma, M.; Pan, Q.; Chen, R.; Schadt, E.; Wang, X. Applying Bayesian hyperparameter optimization towards accurate and efficient topic modeling in clinical notes. In Proceedings of the 2021 IEEE 9th International Conference on Healthcare Informatics (ICHI), Victoria, BC, Canada, 9–12 August 2021; pp. 493–494.
  10. Dinsa, E.F.; Das, M.; Abebe, T.U. A topic modeling approach for analyzing and categorizing electronic healthcare documents in Afaan Oromo without label information. Sci. Rep. 2024, 14, 32051.
  11. Abramoff, M.D.; Whitestone, N.; Patnaik, J.L.; Rich, E.; Ahmed, M.; Husain, L.; Hassan, M.Y.; Tanjil, M.S.H.; Weitzman, D.; Dai, T.; et al. Autonomous artificial intelligence increases real-world specialist clinic productivity in a cluster-randomized trial. npj Digit. Med. 2023, 6, 184.
  12. Amirahmadi, A.; Ohlsson, M.; Etminani, K. Deep learning prediction models based on EHR trajectories: A systematic review. J. Biomed. Inform. 2023, 144, 104430.
  13. Miotto, R.; Li, L.; Kidd, B.A.; Dudley, J.T. Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records. Sci. Rep. 2016, 6, 26094.
  14. Miao, B.Y.; Sushil, M.; Xu, A.; Wang, M.; Arneson, D.; Berkley, E.; Subash, M.; Vashisht, R.; Rudrapatna, V.; Butte, A.J. Characterisation of digital therapeutic clinical trials: A systematic review with natural language processing. Lancet Digit. Health 2024, 6, e222–e229.
  15. Ma, J.; Zhang, Q.; Lou, J.; Xiong, L.; Bhavani, S.; Ho, J.C. Communication Efficient Tensor Factorization for Decentralized Healthcare Networks. In Proceedings of the 2021 IEEE International Conference on Data Mining (ICDM), Auckland, New Zealand, 7–10 December 2021; pp. 1216–1221.
  16. Huang, K.; Li, J.; Ranganath, R. ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. 2019. Available online: https://api.semanticscholar.org/CorpusID:119308351 (accessed on 17 September 2025).
  17. Li, Y.; Nair, P.; Lu, X.H.; Wen, Z.; Wang, Y.; Dehaghi, A.A.K.; Miao, Y.; Liu, W.; Ordog, T.; Biernacka, J.M.; et al. Inferring multimodal latent topics from electronic health records. Nat. Commun. 2020, 11, 2536.
  18. Zou, Y.; Pesaranghader, A.; Song, Z.; Verma, A.; Buckeridge, D.L.; Li, Y. Modeling electronic health record data using an end-to-end knowledge-graph-informed topic model. Sci. Rep. 2022, 12, 17868.
  19. Garriga, R.; Buda, T.S.; Guerreiro, J.; Iglesias, J.O.; Aguerri, I.E.; Matić, A. Combining clinical notes with structured electronic health records enhances the prediction of mental health crises. Cell Rep. Med. 2023, 4, 101260.
  20. Rajkomar, A.; Oren, E.; Chen, K.; Dai, A.M.; Hajaj, N.; Hardt, M.; Liu, P.J.; Liu, X.; Marcus, J.; Sun, M.; et al. Scalable and accurate deep learning with electronic health records. npj Digit. Med. 2018, 1, 18.
  21. Sin, C.; Yip, M. Characterizing Long COVID Patients for Enhanced Clinical Pathways: An Application of Clustering and Topic Modeling to Electronic Health Records. Master’s Thesis, University of Toronto, Toronto, ON, Canada, 2023.
  22. Ruan, X.; Lu, S.; Wang, L.; Li, L.; Wen, A.; Murali, S.; Liu, H. Deep Phenotyping of Obesity: Electronic Health Record–Based Temporal Modeling Study. J. Med. Internet Res. 2025, 27, e70140.
  23. Afshar, A.; Perros, I.; Park, H.; deFilippi, C.; Yan, X.; Stewart, W.; Ho, J.; Sun, J. TASTE: Temporal and static tensor factorization for phenotyping electronic health records. In Proceedings of the ACM Conference on Health, Inference, and Learning, Toronto, ON, Canada, 2–4 April 2020; pp. 193–203.
  24. Seymour, C.; Kennedy, J.; Wang, S.; Chang, C.-C.; Elliott, C.; Xu, Z.; Berry, S.; Clermont, G.; Cooper, G.; Gómez, H.; et al. Derivation, Validation, and Potential Treatment Implications of Novel Clinical Phenotypes for Sepsis. JAMA 2019, 321, 2003–2017.
  25. Chen, J.; Guo, C.; Sun, L.; Lu, M. Mining Typical Drug Use Patterns Based on Patient Similarity from Electronic Medical Records. In Knowledge and Systems Sciences; Chen, J., Yamada, Y., Ryoke, M., Tang, X., Eds.; Springer: Singapore, 2018; pp. 71–86.
  26. Vathy-Fogarassy, Á.; Vassányi, I.; Kósa, I. Multi-level process mining methodology for exploring disease-specific care processes. J. Biomed. Inform. 2022, 125, 103979.
  27. Kurniati, A.P.; Wisudiawan, G.A.A.; Kusuma, G.P.; Saadah, S.; Osman, N.A.; Zulhelmy; Wan, W.N.S.B.; Hafidz, F. Patient Clustering to Improve Process Mining for Disease Trajectory Analysis Using Indonesia Health Insurance Dataset. In Proceedings of the 2024 7th International Conference on Artificial Intelligence and Big Data (ICAIBD), Chengdu, China, 24–27 May 2024; pp. 88–93.
  28. Shao, Y.; Morris, R.S.; Bray, B.E.; Zeng-Treitler, Q. Topic Modeling Based on ICD Codes for Clinical Documents. In Intelligent Systems and Applications; Arai, K., Ed.; Springer International Publishing: Cham, Switzerland, 2022; pp. 184–198.
  29. Meaney, C.; Escobar, M.; Stukel, T.A.; Austin, P.C.; Jaakkimainen, L. Comparison of Methods for Estimating Temporal Topic Models from Primary Care Clinical Text Data: Retrospective Closed Cohort Study. JMIR Med. Inform. 2022, 10, e40102.
  30. Pérez, J.; Pérez, A.; Casillas, A.; Gojenola, K. Cardiology record multi-label classification using latent Dirichlet allocation. Comput. Methods Programs Biomed. 2018, 164, 111–119.
  31. Wang, L.; Lakin, J.; Riley, C.; Korach, T.; Frain, L.; Zhou, L. Disease Trajectories and End-of-Life Care for Dementias: Latent Topic Modeling and Trend Analysis Using Clinical Notes. AMIA Annu. Symp. Proc. 2018, 2018, 1056.
  32. Bagheri, A.; Sammani, A.; van der Heijden, P.G.M.; Asselbergs, F.W.; Oberski, D.L. ETM: Enrichment by topic modeling for automated clinical sentence classification to detect patients’ disease history. J. Intell. Inf. Syst. 2020, 55, 329–349.
  33. Puerari, I.; Duarte, D.; Bianco, G.D.; Lima, J.F. Exploratory Analysis of Electronic Health Records using Topic Modeling. J. Inf. Data Manag. 2021, 11, 131–147.
  34. Bhattacharya, M.; Jurkovitz, C.; Shatkay, H. Identifying patterns of associated-conditions through topic models of Electronic Medical Records. In Proceedings of the 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shenzhen, China, 15–18 December 2016; pp. 466–469.
  35. Wang, Y.; Grant, A.V.; Li, Y. Implementation of a graph-embedded topic model for analysis of population-level electronic health records. STAR Protoc. 2023, 4, 101966.
  36. Kang, H.; Yu, Z.; Gong, Y. Initializing and Growing a Database of Health Information Technology (HIT) Events by Using TF-IDF and Biterm Topic Modeling. AMIA Annu. Symp. Proc. 2018, 2017, 1024–1033.
  37. D’Souza, E.W.; MacGregor, A.J.; Markwald, R.R.; Elkins, T.A.; Zouris, J.M. Investigating insomnia in United States deployed military forces: A topic modeling approach. Sleep Health 2024, 10, 75–82.
  38. Wu, P.; Xu, T.; Wang, Y. Learning Personalized Treatment Rules from Electronic Health Records Using Topic Modeling Feature Extraction. In Proceedings of the 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA), Washington, DC, USA, 5–8 October 2019; pp. 392–402.
  39. Martinis, M.C.; Amodeo, A.; Facente, V.; Greco, F.; Zucco, C. Leveraging Topic Modeling in the Analysis of Urology Medical Reports. In Proceedings of the 2024 Fourth International Conference on Digital Data Processing (DDP), New York, NY, USA, 25–27 September 2024; pp. 118–122.
  40. Wen, Z.; Nair, P.; Deng, C.-Y.; Lu, X.H.; Moseley, E.; George, N.; Lindvall, C.; Li, Y. Mining heterogeneous clinical notes by multi-modal latent topic model. PLoS ONE 2021, 16, e0249622.
  41. Kondratieff, K.E.; Brown, J.T.; Barron, M.; Warner, J.L.; Yin, Z. Mining Medication Use Patterns from Clinical Notes for Breast Cancer Patients Through a Two-Stage Topic Modeling Approach. AMIA Summits Transl. Sci. Proc. 2022, 2022, 303–312.
  42. Agarwal, A.; Banerjee, T.; Romine, W.L.; Thirunarayan, K.; Chen, L.; Cajita, M. Mining Themes in Clinical Notes to Identify Phenotypes and to Predict Length of Stay in Patients admitted with Heart Failure. In Proceedings of the 2023 IEEE International Conference on Digital Health (ICDH), Chicago, IL, USA, 2–8 July 2023; pp. 208–216.
  43. Ahuja, Y.; Zou, Y.; Verma, A.; Buckeridge, D.; Li, Y. MixEHR-Guided: A guided multi-modal topic modeling approach for large-scale automatic phenotyping using the electronic health record. J. Biomed. Inform. 2022, 134, 104190.
  44. Wang, R.; Wang, Z.; Song, Z.; Buckeridge, D.; Li, Y. MixEHR-Nest: Identifying Subphenotypes within Electronic Health Records through Hierarchical Guided-Topic Modeling. In Proceedings of the 15th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Shenzhen, China, 22–25 November 2024. Article No. 53.
  45. Li, Y.; Yang, A.Y.; Marelli, A.; Li, Y. MixEHR-SurG: A joint proportional hazard and guided topic model for inferring mortality-associated topics from electronic health records. J. Biomed. Inform. 2024, 153, 104638.
  46. Chen, J.H.; Goldstein, M.K.; Asch, S.M.; Mackey, L.; Altman, R.B. Predicting inpatient clinical order patterns with probabilistic topic models vs conventional order sets. J. Am. Med. Inform. Assoc. 2017, 24, 472–480.
  47. Lebeña, N.; Blanco, A.; Pérez, A.; Casillas, A. Preliminary exploration of topic modelling representations for Electronic Health Records coding according to the International Classification of Diseases in Spanish. Expert Syst. Appl. 2022, 204, 117303.
  48. Song, Z.; Sumba, X.T.; Xu, Y.; Liu, A.; Guo, L.; Powell, G.; Verma, A.; Buckeridge, D.; Marelli, A.; Li, Y. Supervised multi-specialist topic model with applications on large-scale electronic health record data. In Proceedings of the 12th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, Gainesville, FL, USA, 1–4 August 2021. Article No. 6.
  49. Ramon-Gonen, R.; Dori, A.; Shelly, S. Towards a practical use of text mining approaches in electrodiagnostic data. Sci. Rep. 2023, 13, 19483.
  50. Rijcken, E.; Scheepers, F.; Zervanou, K.; Spruit, M.; Mosteiro, P.; Kaymak, U. Towards Interpreting Topic Models with ChatGPT. In Proceedings of the IFSA 2023 Conference, Denver, CO, USA, 25–29 September 2023.
  51. Wang, Y.; Zhao, Y.; Therneau, T.M.; Atkinson, E.J.; Tafti, A.P.; Zhang, N.; Amin, S.; Limper, A.H.; Khosla, S.; Liu, H. Unsupervised machine learning for the discovery of latent disease clusters and patient subgroups using electronic health records. J. Biomed. Inform. 2020, 102, 103364.
  52. Kaplan, A.D.; Greene, J.D.; Liu, V.X.; Ray, P. Unsupervised probabilistic models for sequential Electronic Health Records. J. Biomed. Inform. 2022, 134, 104163.
  53. Noble, P.J.M.; Appleton, C.; Radford, A.D.; Nenadic, G. Using topic modelling for unsupervised annotation of electronic health records to identify an outbreak of disease in UK dogs. PLoS ONE 2021, 16, e0260402.
  54. Zhao, J.; Zhang, Y.; Schlueter, D.J.; Wu, P.; Kerchberger, V.E.; Rosenbloom, S.T.; Wells, Q.S.; Feng, Q.P.; Denny, J.C.; Wei, W.Q. Detecting time-evolving phenotypic topics via tensor factorization on electronic health records: Cardiovascular disease case study. J. Biomed. Inform. 2019, 98, 103270.
  55. Kim, Y.; El-Kareh, R.; Sun, J.; Yu, H.; Jiang, X. Discriminative and Distinct Phenotyping by Constrained Tensor Factorization. Sci. Rep. 2017, 7, 1114.
  56. Karami, A.; Gangopadhyay, A.; Zhou, B.; Kharrazi, H. Fuzzy Approach Topic Discovery in Health and Medical Corpora. Int. J. Fuzzy Syst. 2018, 20, 1334–1345.
  57. Hassaine, A.; Canoy, D.; Solares, J.R.A.; Zhu, Y.; Rao, S.; Li, Y.; Zottoli, M.; Rahimi, K.; Salimi-Khorshidi, G. Learning multimorbidity patterns from electronic health records using Non-negative Matrix Factorisation. J. Biomed. Inform. 2020, 112, 103606.
  58. Malakouti, S.; Hauskrecht, M. Predicting Patient’s Diagnoses and Diagnostic Categories from Clinical-Events in EHR Data. In Artificial Intelligence in Medicine; Riaño, D., Wilk, S., ten Teije, A., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 125–130.
  59. Zhao, J.; Feng, Q.; Wu, P.; Warner, J.L.; Denny, J.C.; Wei, W.-Q. Using topic modeling via non-negative matrix factorization to identify relationships between genetic variants and disease phenotypes: A case study of Lipoprotein (a) (LPA). PLoS ONE 2019, 14, e0212112.
  60. Roosan, D.; Khan, R.; Essien-Aleksi, I.; Nirzhor, S.; Hai, F. Empowering Clinicians with an Agentic AI for Voice-Driven EHR Exploration. 2025. Available online: https://aisel.aisnet.org/pacis2025 (accessed on 17 September 2025).
  61. Li, J.; Namvar, M.; Akhlaghpour, S.; Indulska, M. Constructing Patient Representation through Semi-Supervised Topic Modeling. 2024. Available online: https://aisel.aisnet.org/pacis2024/track11_healthit/track11_healthit/13 (accessed on 17 September 2025).
  62. Li, Y.; Rao, S.; Solares, J.R.A.; Hassaine, A.; Ramakrishnan, R.; Canoy, D.; Zhu, Y.; Rahimi, K.; Salimi-Khorshidi, G. BEHRT: Transformer for Electronic Health Records. Sci. Rep. 2020, 10, 7155.
  63. Meng, Y.; Speier, W.; Ong, M.K.; Arnold, C.W. Bidirectional Representation Learning From Transformers Using Multimodal Electronic Health Record Data to Predict Depression. IEEE J. Biomed. Health Inform. 2021, 25, 3121–3129.
  64. Qiu, J.; Hu, Y.; Li, L.; Erzurumluoglu, A.M.; Braenne, I.; Whitehurst, C.; Schmitz, J.; Arora, J.; Bartholdy, B.A.; Gandhi, S.; et al. Deep representation learning for clustering longitudinal survival data from electronic health records. Nat. Commun. 2025, 16, 2534.
  65. Saigaonkar, S.; Narawade, V. Domain adaptation of transformer-based neural network model for clinical note classification in Indian healthcare. Int. J. Inf. Technol. 2024, 16, 1–19.
  66. Chen, J.; Cutrona, S.L.; Dharod, A.; Moses, A.; Bridges, A.; Ostasiewski, B.; Foley, K.L.; Houston, T.K.; iDAPT Implementation Science Center for Cancer Control. Electronic health record activity changes around new decision support implementation: Monitoring using audit logs and topic modeling. JAMIA Open 2025, 8, ooaf050.
  67. Rupp, M.; Peter, O.; Pattipaka, T. ExBEHRT: Extended Transformer for Electronic Health Records. In Trustworthy Machine Learning for Healthcare; Chen, H., Luo, L., Eds.; Springer Nature: Cham, Switzerland, 2023; pp. 73–84.
  68. Kraljevic, Z.; Bean, D.; Shek, A.; Bendayan, R.; Hemingway, H.; Yeung, J.A.; Deng, A.; Baston, A.; Ross, J.; Idowu, E.; et al. Foresight—A generative pretrained transformer for modelling of patient timelines using electronic health records: A retrospective modelling study. Lancet Digit. Health 2024, 6, e281–e290.
  69. Meng, Y.; Speier, W.; Ong, M.; Arnold, C.W. HCET: Hierarchical clinical embedding with topic modeling on electronic health records for predicting future depression. IEEE J. Biomed. Health Inform. 2021, 25, 1124–1133.
  70. Huang, T.; Rizvi, S.A.; Thakur, R.K.; Socrates, V.; Gupta, M.; van Dijk, D.; Taylor, R.A.; Ying, R. HEART: Learning better representation of EHR data with a heterogeneous relation-aware transformer. J. Biomed. Inform. 2024, 159, 104741.
  71. Li, Y.; Mamouei, M.; Salimi-Khorshidi, G.; Rao, S.; Hassaine, A.; Canoy, D.; Lukasiewicz, T.; Rahimi, K. Hi-BEHRT: Hierarchical Transformer-Based Model for Accurate Prediction of Clinical Events Using Multimodal Longitudinal Electronic Health Records. IEEE J. Biomed. Health Inform. 2023, 27, 1106–1117.
  72. Wen, J.; Hou, J.; Bonzel, C.L.; Zhao, Y.; Castro, V.M.; Weisenfeld, D.; Cai, T.; Ho, Y.L.; Panickan, V.A.; Costa, L.; et al. LATTE: Label-efficient incident phenotyping from longitudinal electronic health records. Patterns 2024, 5, 100906.
  73. Rasmy, L.; Xiang, Y.; Xie, Z.; Tao, C.; Zhi, D. Med-BERT: Pretrained contextualized embeddings on large-scale structured electronic health records for disease prediction. npj Digit. Med. 2021, 4, 86.
  74. Lee, J.M.; Hauskrecht, M. Modeling multivariate clinical event time-series with recurrent temporal mechanisms. Artif. Intell. Med. 2021, 112, 102021.
  75. Meng, Y.; Speier, W.; Ong, M.; Arnold, C.W. Multi-Level Embedding with Topic Modeling on Electronic Health Records for Predicting Depression. In Explainable AI in Healthcare and Medicine: Building a Culture of Transparency and Accountability; Shaban-Nejad, A., Michalowski, M., Buckeridge, D.L., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 241–246.
  76. Nogues, I.E.; Wen, J.; Zhao, Y.; Bonzel, C.L.; Castro, V.M.; Lin, Y.; Xu, S.; Hou, J.; Cai, T. Semi-supervised Double Deep Learning Temporal Risk Prediction (SeDDLeR) with Electronic Health Records. J. Biomed. Inform. 2024, 157, 104685.
  77. Yang, Z.; Mitra, A.; Liu, W.; Berlowitz, D.; Yu, H. TransformEHR: Transformer-based encoder-decoder generative model to enhance prediction of disease outcomes using electronic health records. Nat. Commun. 2023, 14, 7857.
  78. Silva, R.P.; Pollettini, J.T.; Pazin Filho, A. Unsupervised natural language processing in the identification of patients with suspected COVID-19 infection. Cad. Saude Publica 2023, 39, e00243722.
  79. O’Neil, S.T.; Madlock-Brown, C.; Wilkins, K.J.; McGrath, B.M.; Davis, H.E.; Assaf, G.S.; Wei, H.; Zareie, P.; French, E.T.; Loomba, J.; et al. Finding Long-COVID: Temporal topic modeling of electronic health records from the N3C and RECOVER programs. npj Digit. Med. 2024, 7, 296.
  80. Kwon, B.C.; Achenbach, P.; Dunne, J.L.; Hagopian, W.; Lundgren, M.; Ng, K.; Veijola, R.; Frohnert, B.I.; Anand, V. Modeling Disease Progression Trajectories from Longitudinal Observational Data. Annu. Symp. Proc. 2021, 2020, 668.
  81. Zhang, Y.; Zhang, Y.; Wang, H. Patient Subtyping via Learning Hidden Markov Models from Pairwise Co-occurrences in EHR Data. In Proceedings of the 2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, 15–19 July 2024; pp. 1–4.
  82. Huang, Z.; Ge, Z.; Dong, W.; He, K.; Duan, H. Probabilistic modeling personalized treatment pathways using electronic health records. J. Biomed. Inform. 2018, 86, 33–48.
  83. Li, W.; Min, X.; Ye, P.; Xie, W.; Zhao, D. Temporal topic model for clinical pathway mining from electronic medical records. BMC Med. Inform. Decis. Mak. 2024, 24, 20.
  84. Dormosh, N.; Abu-Hanna, A.; Calixto, I.; Schut, M.C.; Heymans, M.W.; van der Velde, N. Topic evolution before fall incidents in new fallers through natural language processing of general practitioners’ clinical notes. Age Ageing 2024, 53, afae016.
  85. Yang, X.; Huang, W.; Zhao, W.; Zhou, X.; Shi, N.; Xia, Q. Exploring Acute Pancreatitis Clinical Pathways Using a Novel Process Mining Method. Healthcare 2023, 11, 2529.
  86. Tianzhao, L.; Jinzhi, H.; Rong, Z.; Jun, S.; Hailong, L.; Yan, L. Revolutionizing clinical decision making through deep learning and topic modeling for pathway optimization. Sci. Rep. 2025, 15, 28787.
  87. Wang, Q.; Wang, Z.; Li, M.; Ni, X.; Tan, R.; Zhang, W.; Wubulaishan, M.; Wang, W.; Yuan, Z.; Zhang, Z.; et al. A feasibility study of automating radiotherapy planning with large language model agents. Phys. Med. Biol. 2025, 70, 075007.
  88. Genovese, A.; Borna, S.; Gomez-Cabello, C.A.; Haider, S.A.; Prabha, S.; Forte, A.J.; Veenstra, B.R. Artificial intelligence in clinical settings: A systematic review of its role in language translation and interpretation. Ann. Transl. Med. 2024, 12, 117.
  89. Wenk, J.; Voigt, I.; Inojosa, H.; Schlieter, H.; Ziemssen, T. Building digital patient pathways for the management and treatment of multiple sclerosis. Front. Immunol. 2024, 15, 1356436.
Figure 1. Architecture of the survey from raw EHRs to topic modeling techniques, pattern discovery, and clinical applications of extracted topics.
Figure 2. PRISMA format diagram of the presented systematic review selection process.
Figure 3. Taxonomy of topic modeling techniques applied to EHRs.
Figure 4. Distribution of selected papers across different topic modeling techniques.
Figure 5. Number of papers by publication type (journal vs. conference).
Figure 6. Publication trend of selected papers over time (2015–2025).
Figure 7. Distribution of selected papers across major publication venues.
Figure 8. Geographic distribution of publications by country.
Figure 9. Timeline of key topic modeling methods in healthcare.
Table 1. Inclusion and exclusion criteria for articles in this SLR.
Inclusion Criteria
Articles on topic modeling, topic modeling using Electronic Health Records, topic modeling in clinical pathway analysis, topic modeling in healthcare, latent topic discovery in healthcare
Articles that use EHR data
Articles published in conferences, journals, and workshops
Articles published from 2015 to 2025
Articles in English language
Exclusion Criteria
Articles not about topic modeling in healthcare
Theoretical papers without empirical methodology or metrics
Gray literature (not published in any reputable venue or not peer-reviewed)
Papers published before 2015
Publications not in English
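The criteria in Table 1 can be read as a screening predicate applied to each candidate record. The following is a minimal sketch under stated assumptions, not the authors' screening procedure: the `Candidate` class, its field names, and the order in which criteria are checked are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All field names are hypothetical labels for the Table 1 criteria.
    year: int
    language: str
    peer_reviewed: bool
    uses_ehr_data: bool
    topic_modeling_in_healthcare: bool
    has_empirical_evaluation: bool


def screen(c: Candidate) -> tuple[bool, str]:
    """Return (included, reason); the checking order is an assumption."""
    if not c.topic_modeling_in_healthcare:
        return False, "not about topic modeling in healthcare"
    if not c.uses_ehr_data:
        return False, "does not use EHR data"
    if not c.has_empirical_evaluation:
        return False, "theoretical paper without empirical methodology or metrics"
    if not c.peer_reviewed:
        return False, "gray literature"
    if not 2015 <= c.year <= 2025:
        return False, "published outside 2015-2025"
    if c.language.lower() != "english":
        return False, "not in English"
    return True, "meets all criteria"


# Two invented records: one eligible, one published too early.
eligible = Candidate(2021, "English", True, True, True, True)
too_old = Candidate(2013, "English", True, True, True, True)
print(screen(eligible))
print(screen(too_old))
```

Returning a reason alongside the decision keeps the exclusion counts auditable, which is what a PRISMA flow diagram reports at each stage.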
Table 2. Systematic search strategy and boolean logic.
| Category | Keywords/Terms |
| --- | --- |
| Target Domain (A) | “Electronic Health Records”, “EHR”, “Clinical Notes”, “Structured Records” |
| Core Method (B) | “Topic Model*”, “Latent Dirichlet Allocation”, “LDA”, “NMF”, “Neural Topic Model” |
| Technical Focus (C) | “Temporal”, “Longitudinal”, “Sequential”, “Transformers”, “LLM”, “Agentic” |
| Application (D) | “Clinical Pathway”, “Phenotyping”, “Risk Stratification”, “Comorbidity” |
| Full Boolean String | (A) AND (B) AND (C OR D) |
| Filters Applied | Date: 2015–2025; Language: English; Document Type: Article, Conference Paper |
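The Boolean string in Table 2 can be assembled programmatically from the four term groups. The sketch below is illustrative only: the terms are copied from the table, but joining terms within a group with OR, the double-quote phrase syntax, and the function names are assumptions, since each bibliographic database has its own query dialect.

```python
# Term groups copied from Table 2; the within-group OR and the quoting
# style are assumptions about how the search was expressed.
TERM_GROUPS = {
    "A": ["Electronic Health Records", "EHR", "Clinical Notes", "Structured Records"],
    "B": ["Topic Model*", "Latent Dirichlet Allocation", "LDA", "NMF", "Neural Topic Model"],
    "C": ["Temporal", "Longitudinal", "Sequential", "Transformers", "LLM", "Agentic"],
    "D": ["Clinical Pathway", "Phenotyping", "Risk Stratification", "Comorbidity"],
}


def or_group(terms):
    """Quote each term and join the group with OR inside parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(groups):
    """Combine the groups as (A) AND (B) AND (C OR D)."""
    a, b, c, d = (or_group(groups[key]) for key in "ABCD")
    return f"{a} AND {b} AND ({c} OR {d})"


query = build_query(TERM_GROUPS)
print(query)
```

Keeping the groups in a dictionary makes it straightforward to rerun the search with an added synonym and to record exactly which string produced each result set.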
Table 3. Combined view of dataset categories and their methodological interactions.
| Dataset | Availability | Modality | Clinical Domain | Country | Dominant Techniques | Methodological Interaction |
| --- | --- | --- | --- | --- | --- | --- |
| MIMIC | Public | Notes + structured | Critical care/ICU | USA | Probabilistic, Temporal, Neural-based Approaches | Ideal for NLP-heavy models extracting semantics from dense notes. |
| UK Biobank | Restricted | Structured + genomic | Population-scale studies | UK | Matrix Factorization, Embedding | Sparse categorical matrices favor dimensionality reduction. |
| Proprietary EHRs | Restricted/Proprietary | Multi-modal (Notes, codes, labs) | Multiple domains | Multiple | Temporal, Embedding/Neural-based Approaches | Fragmented, longitudinal events necessitate temporal and neural layers. |
Table 4. Legend of symbols and rating scales used in the comparative analysis tables.
| Metric/Symbol | (symbol not preserved) | (symbol not preserved) | (symbol not preserved) |
| --- | --- | --- | --- |
| Sensitivity | Highly sensitive | Moderately sensitive | Low sensitivity |
| Computation Cost | High cost | Moderate cost | Low cost |
| Complexity | High complexity | Moderate complexity | Low complexity |
| Scalability | Low scalability | Moderate scalability | High scalability |

| Symbol | Meaning |
| --- | --- |
| (symbol not preserved) | Included/Addressed: Present in the proposed framework/model |
| (symbol not preserved) | Not Included/Absent: Not addressed or missing in the model |
Table 5. Comparative analysis of classical approaches for topic modeling.
| Ref. No. | Year | Technique | Data Modality |
| --- | --- | --- | --- |
| [24] | 2019 | Consensus Clustering | Structured Only |
| [28] | 2022 | ICD TM (Code Co-occurrence) | Structured Only |
| [26] | 2022 | MEDCP (Process Mining) | Structured Only |
| [27] | 2024 | Patient Similarity + Process Mining | Structured Only |
| [2] | 2018 | Patient Similarity + Pattern Mining | Structured Only |
| [25] | 2018 | Sequence/Pattern Mining | Structured Only |
| [8] | 2024 | TF-IDF + Supervised Classifier | Mixed (Text + Struct) |

Symbol-rated columns (ratings not preserved in this text): Sequence Discovery; Rule-Based/Heuristic; Reliance on Clinical Ontology; Qualitative Evaluation; Generalizability; Discrete Patient Clusters Discovery; Coding/Log Quality Dependency.
Table 6. Comparative analysis of probabilistic modeling approaches for topic modeling.
| Ref. No. | Year | Technique | Modeling Type | Data Modality | Data Limitation |
| --- | --- | --- | --- | --- | --- |
| [21] | 2020 | LDA (Gensim) | Phenotype | Text Only | Note Quality |
| [49] | 2023 | LDA + text mining | Phenotype | Multi-Modal | Single Site |
| [42] | 2023 | LDA + clustering | Phenotype | Multi-Modal | Preprocessing Sensitivity |
| [48] | 2021 | MixEHR-S (Supervised BTM) | Prediction | Multi-Modal | Curated Inputs |
| [31] | 2018 | LDA | Thematic Analysis | Text Only | Note Quality |
| [44] | 2024 | MixEHR-Nest (Hier. Guided TM) | Phenotype | Multi-Modal | Ontology Mapping Quality |
| [35] | 2023 | G-ETM (Graph Embedded TM) | Phenotype | Multi-Modal | Graph Quality |
| [51] | 2020 | LDA + PDM clustering | Phenotype | Structured Only | Coding Bias |
| [40] | 2021 | MNTM (Multi-Note TM) | Thematic Analysis | Text Only | Note Type Labels |
| [38] | 2019 | LDA (feature for ITR) | Risk Strat. | Multi-Modal | Assumption Sensitive |
| [30] | 2018 | LDA + supervised classification | Prediction | Text Only | Preprocessing Sensitivity |
| [53] | 2021 | LDA (annotation for surveillance) | Surveillance | Text Only | Short-Text Sensitivity |
| [39] | 2024 | LDA (urology clinical text) | Thematic Analysis | Text Only | Small Dataset |
| [45] | 2024 | MixEHR-SurG (guided BTM + Cox) | Risk Strat. | Multi-Modal | PheCode Priors Quality |
| [9] | 2021 | LDA + Bayesian optimization | Thematic Analysis | Text Only | Metric-Dependent Performance |
| [17] | 2020 | MixEHR (multi-modal BTM) | Phenotype | Multi-Modal | Discretization/Scaling |
| [47] | 2022 | LDA + feature-engineered ICD | Prediction | Text Only | Polysemy/Short Text |
| [36] | 2018 | BTM + TF-IDF classifiers | Prediction | Text Only | Noisy Texts |
| [52] | 2022 | Sequence latent variable model | Phenotype | Structured Only | Complex Inference |
| [41] | 2022 | CTM→STM (two-stage TM) | Thematic Analysis | Text Only | Note Completeness |
| [6] | 2015 | PRSM (LDA-based for risk strat.) | Risk Strat. | Structured Only | Coded EHR Quality |
| [37] | 2024 | LDA-style (clinical notes) | Thematic Analysis | Text Only | Note Quality/Variability |
| [10] | 2024 | LDA (low-resource language) | Thematic Analysis | Text Only | Low-Resource Language |
| [34] | 2016 | LDA-style (co-occurrence) | Thematic Analysis | Multi-Modal | Coding/Preprocessing |
| [46] | 2017 | Clinical LDA variant | Prediction | Structured Only | Institution-Specific Bias |
| [32] | 2020 | ETM (LDA + supervised classifier) | Prediction | Text Only | LDA Weak for Short Text |
| [43] | 2022 | MixEHR-Guided (semi-supervised) | Phenotype | Multi-Modal | Needs Strong Surrogates |
| [50] | 2023 | LDA + ChatGPT (TM + LLM) | Thematic Analysis | Text Only | LLM Hallucination |
| [3] | 2024 | LDA/BERTopic + clustering | Phenotype | Multi-Modal | Text/Site Variation |
| [33] | 2021 | LDA-like + supervised LOS pred. | Prediction | Text Only | Limited Generalizability |

Symbol-rated columns (ratings not preserved in this text): Qualitative Evaluation; Quantitative Evaluation; Short/Noisy Text; Auxiliary Learning; Computation Cost.
Table 7. Comparative analysis of matrix and tensor factorization approaches for topic modeling.
| Ref. No. | Year | Technique | Factorization Type | Data Modality |
| --- | --- | --- | --- | --- |
| [54] | 2019 | Constrained NTF (CP/PARAFAC) | Tensor | Structured Only |
| [59] | 2019 | NMF (Phenome-Genome) | Matrix | Structured Only |
| [58] | 2019 | LSI (SVD-based) | Matrix | Structured Only |
| [29] | 2020 | NMF (Temporal Topic Modeling) | Matrix | Text Only |
| [55] | 2017 | Constrained NTF | Tensor | Structured Only |
| [57] | 2020 | NMF (Temporal Multimorbidity Phenotyping) | Matrix | Structured Only |
| [23] | 2020 | PARAFAC2 + NMF (Joint TF) | Tensor | Structured Only |
| [56] | 2018 | FLSA (Fuzzy LSA) | Matrix | Text Only |

Symbol-rated columns (ratings not preserved in this text): Temporal Dynamics; Subphenotype Utility; Handles Fuzziness; Computation Cost; Sensitivity to Data Sparsity; Model Extensibility.
Table 8. Comparative analysis of transformer and deep neural network-based modeling approaches for topic modeling.
| Ref. No. | Year | Technique | Core Architecture Type | Data Modality | Learning Strategy | Interpretability Mechanism |
| --- | --- | --- | --- | --- | --- | --- |
| [64] | 2025 | VaDeSCEHR | RNN/DNN/VAE | Structured Codes Only | End-to-End Sup. | Topic-Informed |
| [20] | 2018 | Deep DNN on FHIR | RNN/DNN/VAE | Multi-Modal | End-to-End Sup. | |
| [73] | 2021 | BERT-style embeddings | Transformer | Structured Codes Only | Self-Sup. | |
| [22] | 2025 | Temporal deep rep + clustering | RNN/DNN/VAE | Multi-Modal | Semi-Sup. | Topic-Informed |
| [67] | 2023 | Transformer (BEHRT extension) | Transformer | Multi-Modal | End-to-End Sup. | Feature Importance |
| [65] | 2024 | SMDBERT++ | Transformer | Text Only | Semi-Sup. | |
| [72] | 2024 | LATTE | Hybrid Deep | Structured Codes Only | Semi-Sup. | Feature Importance |
| [76] | 2024 | SeDDLeR | RNN/DNN/VAE | Structured Codes Only | Semi-Sup. | |
| [63] | 2021 | Bidirectional Transformer | Transformer | Multi-Modal | Self-Sup. | |
| [75] | 2021 | Hybrid: Hierarchical + Topic | Hybrid Deep | Multi-Modal | End-to-End Sup. | Topic-Informed |
| [13] | 2016 | Stacked Denoising AE | RNN/DNN/VAE | Structured Codes Only | Self-Sup. | |
| [69] | 2021 | HCET (Topic-informed Hier.) | Hybrid Deep | Multi-Modal | End-to-End Sup. | Topic-Informed |
| [62] | 2021 | BERT-style Transformer | Transformer | Structured Codes Only | Self-Sup. | |
| [71] | 2023 | Hierarchical Transformer | Transformer | Multi-Modal | Self-Sup. | |
| [68] | 2024 | GPT-style Transformer | Transformer | Multi-Modal | Self-Sup. | |
| [16] | 2019 | ClinicalBERT | Transformer | Text Only | Self-Sup. | |
| [70] | 2024 | HEART (Rel. Aware Trans.) | Transformer | Structured Codes Only | Self-Sup. | |
| [66] | 2025 | LLM-assisted pathway extraction | Transformer | Multi-Modal | Semi-Sup. | Topic-Informed |
| [74] | 2021 | Recurrent temporal model | RNN/DNN/VAE | Structured Codes Only | End-to-End Sup. | |
| [77] | 2023 | Transformer encoder-decoder | Transformer | Structured Codes Only | Self-Sup. | |
| [78] | 2023 | BERTopic | Transformer | Text Only | Unsup. | Expert Validation |
| [7] | 2025 | LDA vs BERTopic | Transformer | Text Only | Unsup. | Topic-Informed |
| [61] | 2024 | Seed-guided semi-supervised TM | RNN/DNN/VAE | Text Only | Semi-Sup. | Topic-Informed |

Symbol-rated columns (ratings not preserved in this text): Temporal Modeling; High Prediction Accuracy; Model Extensibility; Computation Cost. Empty Interpretability Mechanism cells carried no text in the extracted source.
Table 9. Comparative analysis of temporal approaches for topic modeling.
| Ref. No. | Year | Technique | Core Temporal Mechanism | Data Modality | Modeling Focus | Interpretation Focus |
| --- | --- | --- | --- | --- | --- | --- |
| [5] | 2024 | Probabilistic Latent State Model | HMM/Latent State | Structured Sequences | Comorbidity | Comorbidity Linkage |
| [83] | 2024 | Hidden Markov Model (HMM) | HMM/Latent State | Structured Sequences | Phenotype | State Transition |
| [79] | 2024 | Temporal Topic Model (Unsupervised LDA) | Time-Aware LDA | Structured Sequences | Phenotype | Topic Trajectory |
| [80] | 2021 | Temporal Topic Model (LDA + time) | Time-Aware LDA | Structured Sequences | Treatment Flow | Topic Trajectory |
| [81] | 2024 | Hidden Markov Model (HMM) | HMM/Latent State | Structured Sequences | Progression | State Transition |
| [84] | 2024 | Dynamic Topic Modeling (DTM) | Dynamic Topic Model | Text | Evolution | Topic Trajectory |
| [82] | 2018 | Latent Treatment Topic Model (HMM-like) | HMM/Latent State | Structured Sequences | Treatment Flow | State Transition |

Symbol-rated columns (ratings not preserved in this text): Model Complexity; Scalability; Sensitivity to Binning/Preprocessing.
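Several rows of Table 9 list state transitions as the interpretation focus. As a deliberately simplified illustration of that idea, the sketch below estimates first-order transition probabilities between observed states by counting. This is a plain Markov chain and not a hidden Markov model: an HMM would additionally infer unobserved states, typically with an algorithm such as Baum-Welch. The state labels and sequences are invented toy data, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def transition_probs(sequences):
    """Maximum-likelihood first-order transition probabilities from counts."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count each adjacent (current, next) pair within a patient sequence.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


# Invented toy sequences, one list of visit-level states per patient.
toy_sequences = [
    ["stable", "flare", "stable"],
    ["stable", "stable", "flare"],
    ["flare", "remission"],
]

probs = transition_probs(toy_sequences)
for state, nexts in probs.items():
    print(state, nexts)
```

Each row of the resulting table sums to one, which is what makes it readable as "given this state, where do patients go next".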
Table 10. Comparative analysis of hybrid approaches for topic modeling.
| Ref. No. | Year | Technique | Hybridization Type | Interpretability Mechanism | Data Modality |
| --- | --- | --- | --- | --- | --- |
| [4] | 2019 | FKLSA (Fuzzy K-Means + LSA + PCA) | Feature Fusion | Topic Coherence | Text |
| [86] | 2025 | LDA + BiLSTM | Feature Fusion | Topic Coherence | Mixed |
| [18] | 2022 | KG-TM (Probabilistic Topic + KG Embeddings) | Knowledge-Guided | Concept Linking | Structured |
| [85] | 2023 | Fuzzy Process Mining + Transformer | Feature Fusion | Feature Importance | Structured |

Symbol-rated columns (ratings not preserved in this text): Incorporates Deep Learning; Handles Temporal Data; Computation Cost; Sensitivity to Fusion; Predictive Accuracy.
Table 11. Systematic comparison of the impact of key challenges across topic modeling families.
| Technique | Impact of Scalability | Impact of Interpretability | Impact of Temporal Complexity | Impact of Privacy Concerns |
| --- | --- | --- | --- | --- |
| Classical Techniques | Scalable for mid-sized EHRs; efficiency drops with expanding clinical vocabularies. | Frequency-based transparency of reason; lacks multi-faceted topic depth. | Static architecture; unable to model sequential clinical visits or disease progression. | Requires centralized raw data; complicates patient de-identification. |
| Probabilistic Topic Modeling | Slow inference on large-scale EHRs; limits real-time decision support. | High clinical utility via word-probability distributions matching medical terminology. | Standard bag-of-words approach; ignores clinical event ordering. | Raw co-occurrence reliance; hinders decentralized/private implementation. |
| Matrix and Tensor Factorization | Mathematically scalable but high storage demand for sparse matrices. | Readable mathematical components; lacks probabilistic uncertainty modeling. | Supports time dimensions; complexity grows exponentially with visit frequency. | Operates on aggregated, de-identified matrices; avoids raw text exposure. |
| Embedding and Neural Topic Modeling | Highly robust; handles millions of records via mini-batch SGD. | Abstract latent spaces; difficult for clinicians to audit topic assignments. | Static by default; requires sequential layers (LSTM/Transformers) which increase training time. | Federated learning compatible; shares model weights without exposing raw EHR data. |
| Temporal Models | Increased complexity per time-step; long-term longitudinal analysis is significantly slower. | High utility for tracking disease evolution and patient journey trajectories. | Natively designed for temporal EHR richness; addresses a key SLR gap. | Identifiable sequential patterns; harder to anonymize than static records. |
| Hybrid Models | Dependent on complex components; combination overhead slows large-scale EHR processing. | Enhanced utility; leverages multiple methods to compensate for individual interpretability weaknesses. | Often acts as a temporal patch for static models; prone to sensitivity issues during feature fusion. | Variable; contingent on reliance upon raw text features versus processed embeddings. |
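Table 11 notes that the standard bag-of-words representation used by probabilistic topic models ignores clinical event ordering. The short sketch below illustrates that point with two toy event sequences containing the same events in opposite order; the event names are invented for illustration. Their bag-of-words counts are identical, while an order-aware view (adjacent event pairs) tells them apart.

```python
from collections import Counter

# Invented toy event codes for one patient, and the same events reversed.
sequence_a = ["chest_pain", "troponin_test", "myocardial_infarction", "stent"]
sequence_b = list(reversed(sequence_a))


def bag_of_words(events):
    """Unordered event counts, the input a standard topic model sees."""
    return Counter(events)


def bigrams(events):
    """Adjacent event pairs, a minimal order-aware representation."""
    return list(zip(events, events[1:]))


same_bag = bag_of_words(sequence_a) == bag_of_words(sequence_b)
same_order = bigrams(sequence_a) == bigrams(sequence_b)
print("identical bag-of-words:", same_bag)
print("identical event order:", same_order)
```

This is the gap that the temporal and sequential models in the table are intended to close: they condition on position or time rather than on counts alone.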
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Mehmood, I.; Zahra, Z.; Iqbal, S.; Qahmash, A.; Hussain, I. A Systematic Review of Topic Modeling Techniques for Electronic Health Records. Healthcare 2026, 14, 282. https://doi.org/10.3390/healthcare14020282
