BERT-Based Models for Normalization of Adverse Drug Event Expressions in Social Media to Standard Medical Terminology for Drug Safety Analysis
Abstract
1. Introduction
2. Methods
2.1. Study Design
2.2. SMM4H Dataset
2.3. CADEC Dataset
2.4. Datasets for Model Development
2.5. Model Architecture
2.6. Selection of Algorithmic Parameters
2.7. Performance Metrics
3. Results
3.1. Training-Loss-Based Parameters Selection
3.2. Model Performance
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References

| SOC Code | SOC Name | ADEs in SMM4H | ADEs in CADEC | ADEs in Both |
|---|---|---|---|---|
| 10018065 | General disorders and administration site conditions | 463 | 1295 | 1758 |
| 10037175 | Psychiatric disorders | 421 | 775 | 1196 |
| 10029205 | Nervous system disorders | 316 | 471 | 787 |
| 10017947 | Gastrointestinal disorders | 89 | 627 | 716 |
| 10028395 | Musculoskeletal and connective tissue disorders | 87 | 1685 | 1772 |
| 10022891 | Investigations | 86 | 100 | 186 |
| 10027433 | Metabolism and nutrition disorders | 58 | 7 | 65 |
| 10040785 | Skin and subcutaneous tissue disorders | 43 | 284 | 327 |
| 10021428 | Immune system disorders | 30 | 5 | 35 |
| 10038738 | Respiratory, thoracic and mediastinal disorders | 22 | 144 | 166 |
| 10022117 | Injury, poisoning and procedural complications | 19 | 44 | 63 |
| 10015919 | Eye disorders | 17 | 97 | 114 |
| 10038604 | Reproductive system and breast disorders | 15 | 86 | 101 |
| 10047065 | Vascular disorders | 10 | 42 | 52 |
| 10041244 | Social circumstances | 8 | 10 | 18 |
| 10007541 | Cardiac disorders | 7 | 166 | 173 |
| 10038359 | Renal and urinary disorders | 6 | 63 | 69 |
| 10021881 | Infections and infestations | 5 | 6 | 11 |
| 10013993 | Ear and labyrinth disorders | 5 | 30 | 35 |
| 10019805 | Hepatobiliary disorders | 2 | 17 | 19 |
| 10029104 | Neoplasms benign, malignant and unspecified (incl cysts and polyps) | 2 | 2 | 4 |
| 10042613 | Surgical and medical procedures | 2 | 0 | 2 |
| 10077536 | Product issues | 1 | 0 | 1 |
| 10014698 | Endocrine disorders | 1 | 3 | 4 |
| 10010331 | Congenital, familial and genetic disorders | 1 | 0 | 1 |

| Model | Macro-Precision | Macro-Recall | Macro-F1 | 95% CI (Macro-F1) |
|---|---|---|---|---|
| SMM4H-3 | 0.76 ± 0.04 | 0.76 ± 0.03 | 0.76 ± 0.04 | [0.74, 0.78] |
| SMM4H-6 | 0.78 ± 0.04 | 0.77 ± 0.04 | 0.77 ± 0.03 | [0.76, 0.79] |
| SMM4H | 0.45 ± 0.07 | 0.47 ± 0.08 | 0.45 ± 0.07 | [0.42, 0.49] |
| CADEC-3 | 0.94 ± 0.02 | 0.94 ± 0.01 | 0.94 ± 0.01 | [0.93, 0.95] |
| CADEC-6 | 0.93 ± 0.01 | 0.92 ± 0.01 | 0.92 ± 0.01 | [0.92, 0.93] |
| CADEC | 0.74 ± 0.08 | 0.73 ± 0.09 | 0.73 ± 0.08 | [0.69, 0.77] |
| Both-3 | 0.85 ± 0.01 | 0.84 ± 0.01 | 0.84 ± 0.01 | [0.84, 0.85] |
| Both-6 | 0.86 ± 0.02 | 0.86 ± 0.01 | 0.86 ± 0.01 | [0.85, 0.87] |
| Both | 0.62 ± 0.04 | 0.59 ± 0.03 | 0.60 ± 0.03 | [0.58, 0.61] |
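
The ± values and the 95% CI column in the table above can be related by a short calculation. The sketch below is only illustrative: it assumes (the table itself does not say so) that each entry is a mean ± sample standard deviation over repeated evaluation runs and that the interval is a two-sided Student-t interval on the mean. The run count of 20, the constant `T_CRIT_DF19`, and the helper `t_interval` are assumptions introduced here, not details taken from the article.

```python
import math

# Two-sided 95% critical value of Student's t for 19 degrees of freedom.
# ASSUMPTION: 20 evaluation runs; the run count is not given in the table.
T_CRIT_DF19 = 2.093


def t_interval(mean, sd, n_runs, t_crit):
    """Return (low, high) of a t-based confidence interval on the mean."""
    half_width = t_crit * sd / math.sqrt(n_runs)
    return mean - half_width, mean + half_width


# Macro-F1 of the SMM4H-3 row: 0.76 ± 0.04
low, high = t_interval(0.76, 0.04, n_runs=20, t_crit=T_CRIT_DF19)
print(f"[{low:.2f}, {high:.2f}]")  # prints "[0.74, 0.78]"
```

Under these assumptions the SMM4H-3 interval is reproduced. Because the tabulated means and standard deviations are rounded to two decimals, other rows may differ from the reported bounds by 0.01.
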

| SOC | Support (ADEs) | SMM4H-3 | CADEC-3 | Both-3 | SMM4H-6 | CADEC-6 | Both-6 | SMM4H | CADEC | Both |
|---|---|---|---|---|---|---|---|---|---|---|
| 10037175 | 1196 | 0.79 ± 0.06 | 0.94 ± 0.02 | 0.84 ± 0.03 | 0.75 ± 0.06 | 0.91 ± 0.03 | 0.82 ± 0.04 | 0.75 ± 0.07 | 0.91 ± 0.04 | 0.80 ± 0.03 |
| 10018065 | 1758 | 0.74 ± 0.08 | 0.95 ± 0.02 | 0.90 ± 0.03 | 0.72 ± 0.08 | 0.88 ± 0.02 | 0.84 ± 0.03 | 0.71 ± 0.10 | 0.87 ± 0.02 | 0.81 ± 0.03 |
| 10029205 | 787 | 0.76 ± 0.06 | 0.93 ± 0.04 | 0.78 ± 0.04 | 0.74 ± 0.06 | 0.91 ± 0.04 | 0.75 ± 0.04 | 0.73 ± 0.07 | 0.89 ± 0.04 | 0.74 ± 0.03 |
| 10017947 | 716 | | | | 0.78 ± 0.11 | 0.90 ± 0.04 | 0.87 ± 0.03 | 0.71 ± 0.16 | 0.87 ± 0.05 | 0.84 ± 0.04 |
| 10028395 | 1772 | | | | 0.84 ± 0.11 | 0.95 ± 0.02 | 0.95 ± 0.02 | 0.78 ± 0.12 | 0.94 ± 0.02 | 0.93 ± 0.01 |
| 10022891 | 186 | | | | 0.82 ± 0.13 | 0.98 ± 0.04 | 0.93 ± 0.05 | 0.78 ± 0.14 | 0.89 ± 0.09 | 0.84 ± 0.08 |
| 10027433 | 65 | | | | | | | 0.73 ± 0.11 | 0.85 ± 0.37 | 0.59 ± 0.16 |
| 10040785 | 327 | | | | | | | 0.72 ± 0.21 | 0.89 ± 0.04 | 0.87 ± 0.04 |
| 10038738 | 166 | | | | | | | 0.89 ± 0.15 | 0.93 ± 0.07 | 0.92 ± 0.08 |
| 10022117 | 63 | | | | | | | 0.77 ± 0.27 | 0.65 ± 0.20 | 0.70 ± 0.11 |
| 10015919 | 114 | | | | | | | 0.68 ± 0.20 | 0.90 ± 0.07 | 0.87 ± 0.07 |
| 10038604 | 101 | | | | | | | 0.28 ± 0.30 | 0.85 ± 0.12 | 0.79 ± 0.11 |
| 10047065 | 52 | | | | | | | 0.10 ± 0.21 | 0.68 ± 0.20 | 0.59 ± 0.19 |
| 10021428 | 35 | | | | | | | 0.72 ± 0.34 | 0.15 ± 0.37 | 0.55 ± 0.27 |
| 10041244 | 18 | | | | | | | 0.10 ± 0.31 | 0.45 ± 0.51 | 0.30 ± 0.38 |
| 10007541 | 173 | | | | | | | 0.40 ± 0.50 | 0.79 ± 0.10 | 0.75 ± 0.09 |
| 10038359 | 69 | | | | | | | 0.50 ± 0.51 | 0.92 ± 0.11 | 0.83 ± 0.12 |
| 10021881 | 11 | | | | | | | 0.20 ± 0.41 | 0.35 ± 0.49 | 0.25 ± 0.26 |
| 10013993 | 35 | | | | | | | 0.05 ± 0.22 | 0.72 ± 0.20 | 0.62 ± 0.17 |
| 10019805 | 19 | | | | | | | 1.00 ± 0.00 | 0.83 ± 0.23 | 0.88 ± 0.13 |
| 10042613 | 2 | | | | | | | 0.05 ± 0.22 | 0.10 ± 0.31 | 0 ± 0 |
| 10029104 | 4 | | | | | | | 0 ± 0 | 0.75 ± 0.44 | 0 ± 0 |
| 10077536 | 1 | | | | | | | 0 ± 0 | | 0 ± 0 |
| 10010331 | 1 | | | | | | | 0 ± 0 | | 0 ± 0 |
| 10014698 | 4 | | | | | | | 0 ± 0 | | 0.38 ± 0.22 |

| SOC | Support (ADEs) | SMM4H-3 | CADEC-3 | Both-3 | SMM4H-6 | CADEC-6 | Both-6 | SMM4H | CADEC | Both |
|---|---|---|---|---|---|---|---|---|---|---|
| 10037175 | 1196 | 0.78 ± 0.05 | 0.93 ± 0.03 | 0.85 ± 0.03 | 0.78 ± 0.05 | 0.92 ± 0.03 | 0.84 ± 0.03 | 0.77 ± 0.07 | 0.91 ± 0.04 | 0.81 ± 0.03 |
| 10018065 | 1758 | 0.76 ± 0.05 | 0.95 ± 0.01 | 0.88 ± 0.02 | 0.71 ± 0.05 | 0.89 ± 0.03 | 0.84 ± 0.02 | 0.69 ± 0.06 | 0.88 ± 0.04 | 0.81 ± 0.03 |
| 10029205 | 787 | 0.75 ± 0.07 | 0.94 ± 0.04 | 0.81 ± 0.03 | 0.71 ± 0.07 | 0.91 ± 0.04 | 0.79 ± 0.04 | 0.64 ± 0.05 | 0.88 ± 0.04 | 0.74 ± 0.04 |
| 10017947 | 716 | | | | 0.80 ± 0.09 | 0.91 ± 0.03 | 0.87 ± 0.04 | 0.71 ± 0.13 | 0.87 ± 0.04 | 0.86 ± 0.04 |
| 10028395 | 1772 | | | | 0.82 ± 0.12 | 0.94 ± 0.01 | 0.93 ± 0.02 | 0.73 ± 0.09 | 0.93 ± 0.01 | 0.91 ± 0.02 |
| 10022891 | 186 | | | | 0.87 ± 0.10 | 0.99 ± 0.02 | 0.93 ± 0.05 | 0.77 ± 0.10 | 0.90 ± 0.07 | 0.87 ± 0.06 |
| 10027433 | 65 | | | | | | | 0.75 ± 0.13 | 0.69 ± 0.38 | 0.62 ± 0.10 |
| 10040785 | 327 | | | | | | | 0.78 ± 0.15 | 0.90 ± 0.05 | 0.84 ± 0.05 |
| 10038738 | 166 | | | | | | | 0.91 ± 0.14 | 0.94 ± 0.06 | 0.88 ± 0.07 |
| 10022117 | 63 | | | | | | | 0.63 ± 0.23 | 0.65 ± 0.16 | 0.56 ± 0.12 |
| 10015919 | 114 | | | | | | | 0.76 ± 0.21 | 0.96 ± 0.06 | 0.90 ± 0.08 |
| 10038604 | 101 | | | | | | | 0.30 ± 0.37 | 0.87 ± 0.11 | 0.80 ± 0.11 |
| 10047065 | 52 | | | | | | | 0.12 ± 0.28 | 0.69 ± 0.21 | 0.62 ± 0.16 |
| 10021428 | 35 | | | | | | | 0.78 ± 0.34 | 0.15 ± 0.37 | 0.72 ± 0.28 |
| 10041244 | 18 | | | | | | | 0.10 ± 0.31 | 0.38 ± 0.46 | 0.23 ± 0.32 |
| 10007541 | 173 | | | | | | | 0.29 ± 0.41 | 0.77 ± 0.09 | 0.64 ± 0.08 |
| 10038359 | 69 | | | | | | | 0.37 ± 0.42 | 0.87 ± 0.10 | 0.81 ± 0.11 |
| 10021881 | 11 | | | | | | | 0.15 ± 0.33 | 0.25 ± 0.38 | 0.28 ± 0.36 |
| 10013993 | 35 | | | | | | | 0.05 ± 0.22 | 0.92 ± 0.16 | 0.90 ± 0.20 |
| 10019805 | 19 | | | | | | | 1.00 ± 0.00 | 1.00 ± 0.00 | 0.93 ± 0.11 |
| 10042613 | 2 | | | | | | | 0.05 ± 0.22 | 0.10 ± 0.31 | 0 ± 0 |
| 10029104 | 4 | | | | | | | 0 ± 0 | 0.72 ± 0.44 | 0 ± 0 |
| 10077536 | 1 | | | | | | | 0 ± 0 | | 0 ± 0 |
| 10010331 | 1 | | | | | | | 0 ± 0 | | 0 ± 0 |
| 10014698 | 4 | | | | | | | 0 ± 0 | | 0.72 ± 0.44 |

| SOC | Support (ADEs) | SMM4H-3 | CADEC-3 | Both-3 | SMM4H-6 | CADEC-6 | Both-6 | SMM4H | CADEC | Both |
|---|---|---|---|---|---|---|---|---|---|---|
| 10037175 | 1196 | 0.78 ± 0.04 | 0.93 ± 0.02 | 0.84 ± 0.02 | 0.76 ± 0.04 | 0.92 ± 0.02 | 0.83 ± 0.03 | 0.75 ± 0.05 | 0.91 ± 0.03 | 0.80 ± 0.02 |
| 10018065 | 1758 | 0.75 ± 0.05 | 0.95 ± 0.01 | 0.89 ± 0.02 | 0.71 ± 0.05 | 0.88 ± 0.02 | 0.84 ± 0.02 | 0.70 ± 0.07 | 0.87 ± 0.02 | 0.81 ± 0.02 |
| 10029205 | 787 | 0.75 ± 0.05 | 0.94 ± 0.02 | 0.80 ± 0.02 | 0.72 ± 0.06 | 0.91 ± 0.02 | 0.77 ± 0.02 | 0.68 ± 0.04 | 0.88 ± 0.03 | 0.74 ± 0.03 |
| 10017947 | 716 | | | | 0.78 ± 0.09 | 0.91 ± 0.03 | 0.87 ± 0.03 | 0.70 ± 0.13 | 0.87 ± 0.03 | 0.85 ± 0.02 |
| 10028395 | 1772 | | | | 0.82 ± 0.09 | 0.95 ± 0.01 | 0.94 ± 0.01 | 0.75 ± 0.09 | 0.93 ± 0.01 | 0.92 ± 0.01 |
| 10022891 | 186 | | | | 0.83 ± 0.08 | 0.98 ± 0.02 | 0.93 ± 0.03 | 0.77 ± 0.10 | 0.89 ± 0.06 | 0.85 ± 0.04 |
| 10027433 | 65 | | | | | | | 0.73 ± 0.10 | 0.74 ± 0.36 | 0.60 ± 0.12 |
| 10040785 | 327 | | | | | | | 0.73 ± 0.15 | 0.89 ± 0.03 | 0.85 ± 0.03 |
| 10038738 | 166 | | | | | | | 0.88 ± 0.11 | 0.93 ± 0.04 | 0.90 ± 0.05 |
| 10022117 | 63 | | | | | | | 0.68 ± 0.22 | 0.65 ± 0.17 | 0.62 ± 0.10 |
| 10015919 | 114 | | | | | | | 0.69 ± 0.16 | 0.93 ± 0.05 | 0.88 ± 0.06 |
| 10038604 | 101 | | | | | | | 0.27 ± 0.30 | 0.85 ± 0.09 | 0.79 ± 0.08 |
| 10047065 | 52 | | | | | | | 0.11 ± 0.22 | 0.67 ± 0.17 | 0.59 ± 0.15 |
| 10021428 | 35 | | | | | | | 0.71 ± 0.30 | 0.15 ± 0.37 | 0.59 ± 0.22 |
| 10041244 | 18 | | | | | | | 0.10 ± 0.31 | 0.40 ± 0.47 | 0.24 ± 0.30 |
| 10007541 | 173 | | | | | | | 0.32 ± 0.43 | 0.77 ± 0.07 | 0.69 ± 0.06 |
| 10038359 | 69 | | | | | | | 0.41 ± 0.44 | 0.89 ± 0.08 | 0.82 ± 0.09 |
| 10021881 | 11 | | | | | | | 0.17 ± 0.35 | 0.28 ± 0.41 | 0.25 ± 0.27 |
| 10013993 | 35 | | | | | | | 0.05 ± 0.22 | 0.78 ± 0.15 | 0.72 ± 0.15 |
| 10019805 | 19 | | | | | | | 1.00 ± 0.00 | 0.89 ± 0.16 | 0.89 ± 0.08 |
| 10042613 | 2 | | | | | | | 0.05 ± 0.22 | 0.10 ± 0.31 | 0 ± 0 |
| 10029104 | 4 | | | | | | | 0 ± 0 | 0.73 ± 0.44 | 0 ± 0 |
| 10077536 | 1 | | | | | | | 0 ± 0 | | 0 ± 0 |
| 10010331 | 1 | | | | | | | 0 ± 0 | | 0 ± 0 |
| 10014698 | 4 | | | | | | | 0 ± 0 | | 0.49 ± 0.29 |

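
Macro-averaging gives every SOC class the same weight regardless of its support, which is why rare classes with near-zero scores pull the all-SOC models down so sharply. The sketch below illustrates this with the three fully populated rows of the per-SOC table directly above (SOC codes 10037175, 10018065, 10029205). It assumes, since the table captions are not present here, that this table holds per-SOC F1 scores and that the "-3" models cover exactly these three SOCs; the helper `macro_average` is a name introduced for illustration.

```python
from statistics import mean

# Per-SOC scores copied from the first three rows of the table above.
# ASSUMPTION: these are F1 scores and the "-3" models cover only these SOCs.
per_soc_f1 = {
    "SMM4H-3": [0.78, 0.75, 0.75],
    "CADEC-3": [0.93, 0.95, 0.94],
    "Both-3": [0.84, 0.89, 0.80],
}


def macro_average(scores):
    """Unweighted mean over classes: each SOC counts once, whatever its support."""
    return mean(scores)


for model, scores in per_soc_f1.items():
    print(model, round(macro_average(scores), 2))
# SMM4H-3 0.76
# CADEC-3 0.94
# Both-3 0.84
```

These values agree with the Macro-F1 column of the model summary table (0.76, 0.94, and 0.84), which is consistent with the assumptions stated above.
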
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Dong, F.; Guo, W.; Liu, J.; Varghese, A.; Tong, W.; Patterson, T.A.; Hong, H. BERT-Based Models for Normalization of Adverse Drug Event Expressions in Social Media to Standard Medical Terminology for Drug Safety Analysis. Big Data Cogn. Comput. 2026, 10, 141. https://doi.org/10.3390/bdcc10050141

