From Data to Evidence: Transformative AI for Real-World Data

A special issue of Informatics (ISSN 2227-9709).

Deadline for manuscript submissions: 31 July 2026

Special Issue Editors


Guest Editor
School of Medicine, Indiana University, Indianapolis, IN 46202, USA
Interests: real-world data; electronic health records; data science; machine learning; data privacy; security; clinical informatics
School of Medicine, Indiana University, Indianapolis, IN 46202, USA
Interests: machine learning; multimodal data; health digital twins; time-series data analysis; fair AI

Special Issue Information

Dear Colleagues, 

Real-world data (RWD), particularly electronic health records (EHRs), are increasingly being converted into real-world evidence (RWE) that guides clinical practice, care planning and decision-making. Advanced artificial intelligence (AI) techniques such as natural language processing (NLP), particularly recent large language models (LLMs), unlock value from unstructured EHR text at scale, enabling concept extraction, improved phenotyping and clinical summarization. Causal AI provides credible effect estimation and decision support from observational RWD by addressing confounding, transportability and counterfactual reasoning. Multimodal AI (e.g., health digital twins that fuse EHRs with biomedical imaging, physiological signals and knowledge graphs) supports individualized simulation, prognosis and treatment planning. Together, these capabilities can accelerate clinical research, improve diagnostic tool development and advance the generation of robust RWE. However, important gaps remain for real-world deployment, including LLM hallucinations, privacy protection for EHRs, and bias and fairness concerns in such AI models.

This Special Issue of Informatics aims to improve understanding of how advanced AI tools can leverage EHRs to improve clinical research in real-world settings. We welcome methodology papers, applications, reviews and reproducible resources (datasets, benchmarks and code). High-quality submissions accepted for publication may be considered for discounts at the Editorial Office’s discretion.

Pillars and Topics of Interest

(1) RWD and Causal AI

  • Causal inference with RWD for clinical effectiveness and safety research
  • Identification and mitigation of confounding, selection bias and transportability issues
  • Causal structure learning from heterogeneous clinical data

(2) RWD and Multimodal AI

  • Multimodal fusion of EHRs with biomedical imaging, physiological signals and knowledge graphs
  • Health digital twins for prognosis, simulation and treatment planning
  • Robustness, calibration and generalizability in multimodal models

(3) LLMs/NLP on EHRs

  • Novel models and applications in predictive analytics, clinical NLP and LLMs for unstructured EHRs
  • NLP for de-identification, synthetic text generation and privacy-preserving workflows
  • Bias, fairness, robustness and hallucination mitigation for LLMs 

Prof. Dr. Jiang Bian
Dr. Yu Huang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Informatics is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • electronic health record (EHR)
  • real-world evidence (RWE)
  • large language models (LLMs)
  • digital health
  • multimodal fusion
  • causal AI

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)

50 pages, 1686 KB  
Review
Data Foundations for Medical AI: Provenance, Reliability and Limitations of Russian Clinical NLP Resources
by Arsenii Litvinov, Lev Malishevskii, Evgeny Karpulevich, Iaroslav Bespalov, Yaroslav Nedumov, Sergey Zhdanov, Ivan Oseledets, Evgeniy Shlyakhto and Arutyun Avetisyan
Informatics 2026, 13(3), 45; https://doi.org/10.3390/informatics13030045 - 20 Mar 2026
Abstract
Russian-language resources for medical natural language processing (NLP) are expanding rapidly; however, their fragmentation, uneven curation, and limited clinical reliability hinder the development of safe machine learning systems for prognosis, prevention, and precision medicine. We provide the first systematic survey of Russian medical NLP datasets and analyze their suitability for clinically meaningful tasks as defined by the MedHELM taxonomy. We additionally perform expert clinical validation of three representative public corpora—RuMedPrimeData (real outpatient notes), MedSyn (synthetic clinical notes), and RuMedNLI (translated natural language inference)—assessing clinical plausibility, diagnosis accuracy, and logical consistency. Experts identified substantial reliability issues: across randomly sampled subsets of each corpus, only approximately 20% of RuMedPrimeData records, fewer than 15% of MedSyn records, and approximately 55% of RuMedNLI pairs met essential quality criteria, which can hinder downstream ML systems built on these data. To support robust applications—ranging from medical chatbots and triage assistants to predictive and preventive models—we outline practical requirements for high-quality datasets: coordinated, expert-validated, machine-readable corpora aligned with clinical guidelines and insurance logic, standardized de-identification, and transparent provenance. Strengthening these data foundations will enable the development of reliable, reproducible, and clinically relevant AI systems suitable for real-world healthcare applications. Full article
(This article belongs to the Special Issue From Data to Evidence: Transformative AI for Real-World Data)