Article

Vaccine Perception on Digital Platforms: Topic Modeling of YouTube Comments

1
Institute of Social Sciences, Department of Management Information Systems, Akdeniz University, Antalya 07070, Turkey
2
Department of Nursing Management, Faculty of Nursing, Akdeniz University, Antalya 07070, Turkey
3
Department of Business Administration, Faculty of Economics and Administrative Sciences, Akdeniz University, Antalya 07070, Turkey
4
Department of International Trade and Logistics, Faculty of Applied Sciences, Akdeniz University, Antalya 07070, Turkey
*
Author to whom correspondence should be addressed.
Computers 2026, 15(6), 360; https://doi.org/10.3390/computers15060360
Submission received: 20 April 2026 / Revised: 30 May 2026 / Accepted: 31 May 2026 / Published: 3 June 2026

Abstract

Vaccination stands as a preeminent public health measure in the fight against infectious diseases, with a proven track record of significantly reducing morbidity and mortality rates. However, vaccine hesitancy and misinformation, particularly evident during the pandemic, have emerged as a significant challenge. The present study analyzes public perceptions of vaccination by examining more than 94,000 YouTube comments on 215 vaccine-related videos. Employing topic modeling techniques, namely Hierarchical Dirichlet Process (hLDA), Latent Semantic Analysis (LSA), and Non-Negative Matrix Factorization (NMF), the study identifies key themes, including vaccine safety, side effects, pharmaceutical ethics, and public trust in healthcare authorities. The findings indicate that debates frequently center on political, social, and scientific concepts. Vaccine hesitancy has emerged as a pervasive global phenomenon that transcends cultural boundaries. Misinformation regarding the efficacy of vaccines and the safety of treatments such as ivermectin is prevalent on social media platforms and poses significant challenges to public health efforts. Child vaccination and parental standpoints are also recurring topics of concern. This study underscores the pivotal function of digital platforms such as YouTube in influencing public attitudes regarding vaccination, as well as the necessity for targeted communication strategies, advanced digital literacy, and proactive policies by social media platforms to address misinformation and promote evidence-based information. Such measures are imperative to sustaining elevated vaccination rates and safeguarding public health in the digital age.

1. Introduction

Vaccines are regarded as one of the most efficacious means of safeguarding public health and averting pandemics. Historically, the documented success of vaccines in combating epidemics has led to a significant reduction in illness and mortality from infectious diseases [1]. The rapid development of mRNA vaccines during the course of the COVID-19 pandemic has once again drawn attention to the advances in vaccine technology and its role in public health [2]. The implementation of vaccination programs has been instrumental in the substantial reduction in cases of vaccine-preventable diseases, leading to the preservation of millions of lives annually [3]. Childhood immunization has been identified as a pivotal strategy in the global effort to control the spread of disease and reduce health expenditures, particularly in economically disadvantaged regions. In the 20th century, the implementation of widespread vaccination programs that led to a significant decrease in mortality rates due to infectious diseases represented a major turning point in the field of global public health [4].
The clinical efficacy of vaccination is widely acknowledged on a global scale. It has been demonstrated that contemporary immunization strategies are effective in preventing more than 20 life-threatening diseases and in fundamentally altering public health trajectories. Nevertheless, in political and fiscal discourse, implementation programs are often misrepresented as short-term financial liabilities due to the initial expenses associated with procurement, logistics, and next-generation technologies. In contrast to this cost-centric perspective, economic evaluations demonstrate that vaccination represents a high-yield, long-term strategic investment in human capital. At the microeconomic level, vaccines function as a crucial financial safeguard, shielding vulnerable households from the risk of catastrophic, out-of-pocket medical expenditures that can lead to impoverishment. On a macro level, widespread immunization has been demonstrated to mitigate systemic pressure on healthcare infrastructures and drive national economic development by minimizing workplace absenteeism, reducing disability claims, and sustaining workforce productivity [5,6,7].
Despite these significant macroeconomic and clinical benefits, a critical bottleneck persists within the public sphere. The paradox is that empirical evidence of widespread societal resilience is often eclipsed by asymmetric risk perception, pervasive public misinformation, and an overreliance on isolated adverse events. Consequently, the challenge of contemporary public health lies not only in the development of effective therapeutics, but also in bridging the critical gap between macroeconomic reality and public risk perception [5,6,7].
Advancements in immunology have enabled the rational development of vaccines based on immune responses [5]. Moreover, network-based vaccination strategies have been demonstrated to enhance survival rates during outbreaks and curtail disease propagation [8]. In the contemporary era, the development of innovative vaccine delivery methods, including needle-free systems, has emerged as a pivotal strategy to enhance compliance and accessibility, particularly in regions characterized by limited resources [9].
A review of the history of vaccines reveals seminal contributions such as Edward Jenner’s first successful smallpox vaccine and Louis Pasteur’s anthrax and rabies vaccines, which mark the first major strides in the field of vaccination [2,10]. The eradication of smallpox through vaccination represents a critical milestone in the field of public health [11]. The polio vaccine has demonstrated the long-term benefits of vaccination, leading to the virtual eradication of the disease worldwide [12].
The COVID-19 pandemic has served to underscore the critical importance of vaccines within the broader context of public healthcare strategies. Vaccines are recognized as a cost-effective strategy for outbreak control [13]. However, there has also been an increase in the prevalence of anti-vaccination movements worldwide in recent years. The dissemination of misinformation through less credible media channels has been identified as a primary driver of anti-vaccination sentiment. This proliferation of misinformation has been shown to have a significant impact on public interest and the discourse surrounding vaccination [14]. In societies marked by low institutional trust, individuals who are resistant to vaccination tend to gravitate toward conservative media and social media platforms. This phenomenon underscores the necessity for targeted communication strategies to be implemented [15].
The phenomenon of anti-vaccination movements has evolved to transcend cultural boundaries, giving rise to the development of analogous narratives across diverse cultural contexts. This phenomenon poses a significant threat to public healthcare systems in multiple nations, propelled by factors including skepticism regarding the safety and composition of vaccines [16]. The historical underpinnings of anti-vaccine sentiment can be traced back to the 18th century, suggesting that long-standing opposition to vaccination has been further entrenched by contemporary socio-political dynamics [17]. These apprehensions, stemming from misinformation, have contributed to the resurgence of diseases such as measles and whooping cough. This phenomenon underscores the imperative for maintaining high vaccination coverage [3].
In this regard, it is imperative to sustain vaccination initiatives and to actively oppose the anti-vaccination movement to safeguard public health. The long-term benefits of vaccines in the battle against infectious diseases are twofold: they protect public health and enable a healthier and more sustainable future. Achieving global public health objectives necessitates the enhancement of vaccine coverage in low- and middle-income countries [18].
This study aims to reveal viewers’ opinions on vaccines and vaccination by examining the comments on vaccine-related videos shared on YouTube.

2. Materials and Methods

This section presents the data collection, data preparation, and analysis stages. All analyses and figures were produced using Python 3.14.

2.1. Data Collection

The data for this study consist of 94,072 comments (all in English) on 215 videos shared on the YouTube platform between January 2018 and October 2024. The videos were selected using the keywords “Vaccine”, “BCG Vaccine”, “Baby Vaccine”, “HIV Vaccine”, “Measles Vaccination”, “Flu Vaccination”, “Monkeypox Vaccine”, “Hepatitis B Vaccine”, “COVID Vaccine”, and “Zero Dose Vaccine”. General descriptions of the analyzed videos and the number of comments are presented in Table 1.

2.2. Preparation of the Data

During the data preprocessing stage, all characters in the comments are converted to lowercase. All numerals and punctuation marks are then removed from the text, followed by the stop words. Finally, lemmatization is applied to derive the base form of each word. Lemmatization is a linguistic process that aims to derive the base form of a word by removing its affixes [19,20]. It is of particular significance in tasks such as text indexing, topic modeling, information retrieval, and machine learning, where precise word forms are required [21]. This approach enhances the precision of tasks such as sentiment analysis and text analysis by ensuring the standardization of word forms [22].
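The preprocessing steps described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the stop-word set and the lemma lookup table are tiny hypothetical stand-ins for a full stop-word list and a dictionary-based lemmatizer, and the function name `preprocess` is an assumption.

```python
import re
import string

# Assumption: a tiny illustrative stop-word list, not the list used in the study.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "were", "my", "and", "of", "to", "in", "it"}

# Assumption: a toy lemma lookup standing in for a dictionary-based lemmatizer.
LEMMAS = {"vaccines": "vaccine", "children": "child", "got": "get", "shots": "shot"}

# Map every ASCII punctuation character to a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(comment: str) -> list[str]:
    """Lowercase, strip digits and punctuation, drop stop words, then lemmatize."""
    text = comment.lower()
    text = re.sub(r"\d+", " ", text)           # remove numerals
    text = text.translate(PUNCT_TO_SPACE)      # remove punctuation
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return [LEMMAS.get(t, t) for t in tokens]  # unknown words pass through unchanged


print(preprocess("My 2 children got the vaccines in 2021!"))
```

The order matters: stop words are filtered on the lowercased, punctuation-free tokens, and lemmatization runs last so that the lookup sees clean word forms.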

2.3. Topic Modeling

The Hierarchical Dirichlet Process (hLDA), Latent Semantic Analysis (LSA), and Non-Negative Matrix Factorization (NMF) methods were used for topic modeling of the preprocessed text.

2.3.1. Hierarchical Dirichlet Process (hLDA)

hLDA is a robust Bayesian nonparametric model that extends the Dirichlet Process by incorporating a hierarchy of distributions. It is useful for topic modeling in complex datasets characterized by inherent clustering of data, because it infers the optimal number of topics from the data itself, thereby obviating the need to specify that number a priori. LDA offers flexible topic modeling; however, interpreting its results can be challenging when the number of clusters is large or the data is complex [23,24]. The added hierarchical structure, while modeling topical relationships, also increases computational demands, thereby limiting its practicality for large-scale topic modeling under resource constraints.
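The idea that the number of clusters is inferred rather than fixed in advance can be illustrated with the Chinese restaurant process, a standard sampling view of the Dirichlet process prior. The sketch below is only an illustration of that prior: it is not the model fitted in this study, it omits the hierarchy and the posterior inference over words, and the function name and the concentration parameter `alpha` are assumptions.

```python
import random


def chinese_restaurant_process(n_items: int, alpha: float, seed: int = 0) -> list[int]:
    """Return cluster sizes after assigning n_items one by one under a CRP prior."""
    rng = random.Random(seed)
    sizes: list[int] = []
    for i in range(n_items):
        # Item i joins existing cluster k with probability sizes[k] / (i + alpha)
        # and opens a new cluster with probability alpha / (i + alpha).
        r = rng.uniform(0, i + alpha)
        cumulative = 0.0
        for k, size in enumerate(sizes):
            cumulative += size
            if r < cumulative:
                sizes[k] += 1
                break
        else:
            sizes.append(1)  # no existing cluster chosen: open a new one
    return sizes


sizes = chinese_restaurant_process(n_items=1000, alpha=2.0)
print(len(sizes), "clusters; largest sizes:", sorted(sizes, reverse=True)[:5])
```

The number of clusters is an outcome of the process, not an input. Larger values of `alpha` tend to produce more clusters.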

2.3.2. Latent Semantic Analysis (LSA)

LSA facilitates the extraction of topics by effectively identifying the associations between terms and documents. LSA has gained significant traction as a prevalent approach for document clustering and topic modeling. In comparison with conventional encoding methodologies, LSA provides enhanced capabilities for the analysis of data structures and the revelation of underlying semantic relations within textual content [25]. The efficacy of the LSA method is contingent upon the quality and suitability of the input data. In the absence of proper management, the results may be distorted [26]. Topics generated by LSA have been shown to be challenging to interpret. This is due to the possibility that they reflect abstract associations not directly apparent in the original text. As a result, potential misinterpretations may occur during analysis of the underlying semantic relationships [27].
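As a rough sketch of how LSA relates terms and documents, the example below extracts the two leading singular components of a toy document-term count matrix using power iteration with deflation. Everything here is illustrative: the matrix values and term list are made up, raw counts are used because the weighting scheme is not stated above, and a real analysis would rely on a library SVD routine rather than this hand-written one.

```python
import math
import random

TERMS = ["vaccine", "child", "autism", "pharma", "money"]
# Toy document-term counts (rows = comments, columns = TERMS); values are made up.
DOC_TERM = [
    [1, 1, 1, 0, 0],
    [1, 2, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 1, 2],
]


def leading_triplet(A, iters=300, seed=0):
    """Leading singular value and vectors of A, via power iteration on A^T A."""
    rng = random.Random(seed)
    n_rows, n_cols = len(A), len(A[0])
    v = [rng.random() + 0.1 for _ in range(n_cols)]
    for _ in range(iters):
        u = [sum(A[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]  # u = A v
        w = [sum(A[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    return sigma, [x / sigma for x in u], v


def lsa(A, k):
    """Return the k leading (sigma, document_vector, term_vector) triplets."""
    residual = [[float(x) for x in row] for row in A]
    triplets = []
    for _ in range(k):
        sigma, u, v = leading_triplet(residual)
        triplets.append((sigma, u, v))
        # Deflate: subtract the component just found so the next pass finds the next one.
        for i in range(len(residual)):
            for j in range(len(residual[0])):
                residual[i][j] -= sigma * u[i] * v[j]
    return triplets


for sigma, _, term_vector in lsa(DOC_TERM, k=2):
    ranked = sorted(zip(TERMS, term_vector), key=lambda tv: -abs(tv[1]))
    print(round(sigma, 3), [(t, round(w, 2)) for t, w in ranked[:3]])
```

In this toy case the second component assigns opposite signs to the child-related and money-related terms. Signed loadings of this kind are one reason LSA topics can be harder to read than non-negative ones.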

2.3.3. Non-Negative Matrix Factorization (NMF)

NMF is a topic modeling technique that has been demonstrated to be particularly effective in identifying latent topics within large datasets. NMF decomposes a term frequency–inverse document frequency (TF-IDF) matrix into non-negative factors, thereby enabling the identification of underlying themes without the necessity of labeled data. In comparative analyses, NMF has demonstrated topic coherence performance comparable to that of LDA and BERTopic across various types of textual data, such as literary texts and news headlines, and has been shown to be similarly effective for topic extraction in academic literature [28,29]. In addition, the NMF algorithm has demonstrated notable efficacy in the domains of classification and clustering, particularly in scenarios involving high-dimensional data [30]. However, the numerical optimization process inherent to NMF may require a substantial computational investment, thereby imposing limitations on its practical application [31]. Furthermore, ambiguities in factorization have the potential to impede the interpretation of results [32].
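A minimal sketch of the factorization itself, assuming the classic multiplicative update rules for the Frobenius-norm objective, is given below. The small matrix `V` is a made-up stand-in for a TF-IDF matrix, and the solver, its parameters, and the helper names are assumptions; the solver actually used in the study is not specified.

```python
import random


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh)) ** 0.5


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into non-negative W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return W, H


# Made-up TF-IDF-like weights: two comments about one theme, two about another.
V = [
    [0.9, 0.8, 0.0, 0.0],
    [0.7, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.8, 0.9],
    [0.0, 0.0, 0.9, 0.7],
]
W, H = nmf(V, k=2)
print("reconstruction error:", round(frobenius_error(V, W, H), 4))
```

Each row of `H` can be read as a topic (term weights) and each row of `W` as a comment's topic mixture. Because both factors stay non-negative, the top-weighted terms per topic are directly interpretable, although different initializations can yield different factorizations.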

3. Findings

This section presents the findings of topic modeling analyses conducted on the comments collected from the audience.

3.1. Analysis Results of Video Comments Using the Hierarchical Dirichlet Process Method

The hLDA method was used to analyze comments on YouTube videos about vaccines. As a result of the analysis, 10 distinct topics were identified. These topics, which represent the prominent themes in the comments, are presented in Table 2 and visualized in Figure 1.
Upon examination, Topic 1 appears to reflect discussions concerning pharmaceutical companies, medical treatments, and treatment processes. Keywords such as ‘Company,’ ‘Money,’ ‘Medication,’ and ‘Medicine’ suggest that the topic captures debates about the cost and effectiveness of treatments, as well as the role of pharmaceutical companies.
Topic 2 appears to be related to the vaccination of children and associated concerns. Keywords such as ‘Vaccine,’ ‘Child,’ ‘Autism,’ and ‘Parent’ indicate concerns about vaccine safety and reflect parents’ anxieties.
Topic 3 pertains to the biological effects of vaccines and their contributions to the immune system. Keywords such as ‘Virus,’ ‘Body,’ ‘Immune,’ ‘mRNA,’ and ‘Protein’ reflect discussions on natural immunity versus vaccine-induced immunity.
Topic 4 highlights COVID-19 and the effects of vaccines related to this disease. Keywords such as ‘COVID,’ ‘Death,’ ‘Effect,’ and ‘Hospital’ reflect concerns about the role of vaccines in preventing the illness and potential side effects. Similarly, Topic 6 includes terms like ‘Conspiracy,’ ‘Heart,’ ‘Pfizer,’ and ‘Dead,’ suggesting that it encompasses debates surrounding COVID-19 vaccines.
The fifth topic is indicative of anti-vaccine rhetoric. The terms “Anti,” “Vax,” and “Vaxxers” are indicative of discourse pertaining to the anti-vaccine movement. The utilization of keywords such as “experiment” and “dangerous” unveils the prevailing arguments employed in such discourses. In a manner analogous to the aforementioned points, Topic 7 comprises expressions such as “antivaxxers,” “propaganda,” “smallpox,” and “news.” These elements are suggestive of references to news coverage related to anti-vaccine propaganda. Furthermore, certain comments establish connections to past outbreaks through terms such as “monkeypox” and “history,” thereby underscoring the utilization of historical and recent outbreaks to inform contemporary anti-vaccine arguments.
Keywords in Topic 8, including ‘Government,’ ‘Trump,’ ‘Science,’ and ‘Trust,’ focus on credibility concerns for medical authorities and public health policies.
In contrast, the terms in Topic 9—such as ‘Blood,’ ‘Clot,’ ‘Poison,’ and ‘Humanity’—appear semantically inconsistent with one another, yet they seem to reflect comments concerning vaccine side effects and their broader societal implications.
Similarly, keywords in Topic 10, including ‘Mask,’ ‘Safe,’ ‘Truth,’ and ‘Effective,’ highlight opinions regarding measures taken in response to pandemics.
Upon examination of the Intertopic Distance Map from the HDP-LDA Topic Modeling in Figure 2, a remarkable morphological pattern is observed. It is evident that the subjects of Topic 4 (variants of “COVID-19,” and the general statistics related to health) and Topic 6 (an examination of the pharmaceutical product known as “Pfizer,” including potential side effects and speculations regarding their relationship to cardiovascular health) are distinctly separated in the left quadrant. This observation is indicative of a high degree of discriminant validity. This separation suggests that discussions centered around specific medical arguments or concrete clinical concerns give rise to more isolated and thematic discourse areas.
Conversely, a substantial overlap is observed among Topics 5, 7, 8, 9, and 10 in the right quadrant. It can be posited that Topic 5 embodies a general anti-vaccine rhetoric, while Topic 7 encompasses monkeypox, chickenpox, and historical propaganda conspiracies. Topic 8 delves into government, institutional distrust, and Trump-era policies, while Topic 9 centers on blood clotting and poisoning claims. Finally, Topic 10 addresses mask mandates and social polarization.
Contrary to the conventional wisdom, users do not formulate institutional distrust, biological concerns, and conspiracy theories as discrete logical arguments. Rather, they integrate political distrust, conspiracy claims, and biological side-effect allegations (clotting, poisoning, etc.) into a unified comment. The findings indicate an overlapping of Topics 5, 7, 8, 9, and 10, suggesting a structured, integrated conspiracy ecosystem in digital spaces that fosters vaccine hesitancy and opposition. This ecosystem is characterized by the intertwining of narratives, which, through empirical evidence, coalesce to form a homogeneous semantic cloud.

3.2. Findings from the Analysis of Video Comments Using the Latent Semantic Analysis Method

Using the Latent Semantic Analysis (LSA) method, ten distinct topics were identified from comments on vaccine-related YouTube videos. These topics are presented in Table 3 and Figure 3.
The keywords found in Topic 1—such as ‘Child,’ ‘Died,’ ‘Jab,’ ‘Life,’ ‘Vaccination,’ and ‘Disease’—appear to reflect discussions on childhood vaccination. Building on this, Topic 4 addresses concerns about vaccination risks and parental anxiety, as indicated by terms such as ‘Risk,’ ‘Measles,’ ‘Autism,’ ‘Parent,’ and ‘Child,’ which highlight debates specifically focused on the immunization of children. Furthermore, Topic 8 includes words such as ‘School,’ ‘mRNA,’ ‘Measles,’ and ‘Vaccinate,’ suggesting references to school outbreaks and issues related to vaccinating children.
The second topic is characterized by the presence of keywords such as “Ivermectin,” “Scam,” “Strain,” “Failure,” and “Positive,” which collectively imply references to treatment recommendations and related skepticism. The third topic encompasses terms such as “hope,” “informative,” “science,” and “animation.” These terms appear to reflect viewers’ attitudes toward educational and informative videos. The fifth topic pertains to comments regarding the immune system and viral infections. Key words such as “antibody,” “immune,” “protein,” “cell,” and “virus” indicate discussions of immune responses and vaccine effects. In a similar vein, Topic 6 encompasses keywords such as “HIV,” “measles,” “autism,” and “immune,” reflecting ongoing debates concerning the impact of diseases and the role of vaccination.
In Topic 7, the presence of words such as “Stupid,” “Anti,” “Vaxxers,” and “Poison” captures expressions of vaccine skepticism and concern, suggesting references to anti-vaccine viewpoints and critical attitudes toward immunization. Maintaining the emphasis on skepticism and concern, Topic 9 presents terminology such as “COVD,” “Sick,” “mRNA,” and “Vaccination,” with the commentary focusing on the discourse surrounding the novel Coronavirus and the efforts to develop vaccines. In conclusion, Topic 10 appears to underscore a pervasive sense of distrust, particularly directed towards vaccine manufacturers and governmental authorities. A preliminary analysis of the keywords reveals an underlying theme of critical commentary on political powers, pharmaceutical companies, and public health policies. Keywords such as “Company,” “Pharma,” “Money,” and “Government” appear to be particularly salient in this regard.
A close examination of the scatter plot in Figure 4 reveals that the initial cluster, situated in the left-middle quadrant, encompasses Topics 1, 6, 7, and 9. These topics collectively constitute the fundamental framework within which the discussion on vaccines is integrated. The primary focus of Topics 1 and 9 is on the biological and mortality effects of vaccination, encompassing life, death, DNA, mRNA, and cells. Topics 6 and 7 are concerned with anti-vaccine arguments (e.g., HIV, autism) and the social backlash against them (e.g., “stupid,” “vaxxers,” “anti”). The intersection of these two categories reveals that the discussions are empirically inseparable.
The second cluster, located in the upper-left quadrant, encompasses Topics 4 and 8, which address the subjects of childhood immunization and the dissemination of misinformation, respectively. The fourth topic encompasses school mandates and unvaccinated children, addressing pertinent issues such as school attendance, measles, and parental concerns. The eighth topic delves into autism, poison, and perceptions regarding mRNA. The disconnection from the primary structure indicates that discourse concerning childhood vaccines adheres to divergent semantic patterns.
The third cluster, located in the lower-right quadrant, encompasses Topics 10 and 2, which represent institutional distrust and alternative treatments, respectively. The tenth topic, “Pharma, Government, and Science,” is indicative of a pervasive distrust that encompasses the pharmaceutical industry, government entities, and scientific institutions. This theme is characterized by a confluence of factors, including perceived corruption, financial incentives, and the dissemination of misinformation, which collectively erode public trust. The second topic, which is isolated, encompasses alternative treatments such as ivermectin and associated fraudulent discourses.
Two independent topics are also identified. Topic 3 (upper-right) contains positive responses to informative content (hope, doc, informative). Topic 5 (lowest-left) focuses purely on clinical terminology (antibody, infected, immune, virus) with no discursive elements.
Based on these findings, the LSA model distinguishes four thematic areas: an integrated vaccine backbone, childhood immunization and misinformation, institutional distrust with alternative treatments, and independent neutral/clinical topics.

3.3. Findings from the Analysis of Video Comments Using the Non-Negative Matrix Factorization Method

As a result of the analysis conducted using the NMF method, vaccine-related comments were grouped around ten distinct themes. These themes are presented in Table 4.
The first topic appears to concentrate on discourse pertaining to the COVID-19 pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and the development of vaccines to combat it. Keywords such as “infection,” “mRNA,” “booster,” “flu,” “pandemic,” and “death” underscore the imperative for combating the virus and the pivotal role of vaccines in this endeavor. In Topic 2, the keywords “heart,” “risk,” “safe,” “health,” “vaccination,” and “effect” appear to encompass general public opinions regarding vaccines. The keywords in Topic 3, including ‘Doc,’ ‘Informative,’ ‘News,’ ‘Explanation,’ and ‘Hope,’ mirror viewers’ favorable dispositions towards educational and informative videos.
The fourth topic encompasses terms such as “herbal,” “natural,” “protein,” “immune,” “cure,” and “corona,” indicating the potential for discourse on natural remedies and their impact on the immune system. In Topic 5, the utilization of keywords such as ‘Infected,’ ‘Population,’ ‘Measles,’ ‘Protect,’ ‘Un-vaccinated,’ and ‘Vaccinated’ underscores the pivotal role of vaccination in safeguarding public health. The inclusion of keywords such as “Baby,” “Parent,” “Child,” “Polio,” “Measles,” and “Autism” suggests an emphasis on parental perspectives regarding childhood vaccination in Topic 6. Furthermore, the emphasis on these terms underscores the prevailing concerns regarding children’s health and vaccine-related issues from the viewpoint of parents.
The seventh topic contains words such as “stupid,” “idiot,” “mom,” “vax,” “movement,” and “anti,” which indicate criticism directed at anti-vaccine views. The eighth topic, which includes terms such as “Mom,” “Poor,” “Autism,” “Poison,” “Parent,” and “Kid,” underscores the heterogeneity of approaches and concerns regarding childhood immunization. The ninth topic encompasses keywords such as “cancer,” “antibody,” “mRNA,” “immune,” “protein,” and “DNA,” which direct the reader to discussions of the biological mechanisms of vaccines. Moreover, these keywords underscore their effects on the immune system.
Finally, Topic 10, with keywords such as ‘Paid,’ ‘Evil,’ ‘Pharma,’ ‘Money,’ ‘Government,’ and ‘Trust,’ reflects debates over the credibility of pharmaceutical companies and ethical concerns in public health policy.
These findings are visualized in Figure 5 and Figure 6.
Figure 6 indicates that the subjects are distributed across three primary clusters on the spatial map. The initial cluster encompasses the subjects of the COVID-19 pandemic and institutional trust issues. Of particular note is the close proximity of Topic 1, which addresses infection, mRNA, mortality, and the aforementioned pandemic, to Topic 2, which focuses on the heart and associated risks and side effects. This observation suggests a nexus between pandemic-related discourse and discussions concerning vaccine safety and risk perception. Topic 3 (doc, explanation, information, good) exhibits significant overlap with the two aforementioned topics, while Topic 10 (pharma, government, lie, trust) reflects conspiracy theories and institutional distrust, located at the periphery of the cluster.
The second cluster revolves around traditional childhood vaccines, legal mandates, and social polarization. Topics 6 and 8 are almost completely overlapping, sharing themes such as baby, measles, polio, autism, and parent. Topic 5 addresses school mandates and the risks posed by unvaccinated individuals. Topic 7 represents the pejorative and polarized language between anti-vaccine and pro-vaccine groups, positioned between the two clusters.
The third cluster is entirely isolated from the others. Topics 4 and 9 encompass cellular immune processes (DNA, mRNA, antibody, protein, cell) as well as herbal and natural remedies (natural, herbal, cure). These two topics are spatially separated from the others. In light of the aforementioned findings, it has been ascertained that the model has delineated three distinct thematic domains, namely: pandemic vaccines, vaccines for pediatric populations, and vaccines for the treatment of chronic diseases.
A comprehensive analysis of the three methods employed revealed the following subjects, shown in Figure 7: political authority, health policies, and pharmaceutical companies; child vaccination and family perspectives; the impact of vaccines on the immune system; opinions on the pandemic and the vaccine development process; and discussions related to anti-vaccination sentiments. Furthermore, the hLDA and NMF methods both revealed a topic related to opinions about videos. In contrast, the LSA method identified topics exclusively related to misleading treatment suggestions and opinions on vaccines and public health. A notable finding of the hLDA analysis was the identification of vaccine side effects as a unique topic.

3.4. Performance Comparison of Topic Modeling Methods

The coherence score is a metric that quantifies the frequency with which the most probable words within a given topic co-occur in documents, thereby measuring the degree of semantic consistency between these words. This metric is designed to evaluate the interpretability and meaningfulness of a given topic to humans, thereby reflecting its alignment with expert judgments. Accordingly, higher coherence scores are indicative of more coherent topics [33]. In a related manner, [34] demonstrated that the coherence score improves interpretability and that novel combinations also support topic consistency.
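To make the co-occurrence idea concrete, the sketch below scores a topic by the mean normalized pointwise mutual information (NPMI) of its word pairs, using document-level co-occurrence. This is an assumption for illustration: the text does not state which coherence variant was used, so the measure, the function name, and the toy corpus are all hypothetical and the resulting values are not comparable to those reported in Table 5.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Mean NPMI over all word pairs, using document-level co-occurrence."""
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # never co-occur: minimum NPMI by convention
        elif p12 == 1:
            scores.append(1.0)   # always co-occur: maximum NPMI (limit value)
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Made-up, already preprocessed comments.
docs = [
    ["child", "autism", "vaccine"],
    ["child", "parent", "autism"],
    ["pharma", "money", "government"],
    ["pharma", "money", "trust"],
    ["vaccine", "government", "child"],
]
print(npmi_coherence(["child", "autism", "parent"], docs))  # words that co-occur
print(npmi_coherence(["child", "money", "trust"], docs))    # words that rarely co-occur
```

A topic whose top words appear together in the same comments scores higher than one that mixes unrelated words.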
As indicated by the findings in Table 5, the employed topic modeling methods vary with respect to accuracy and efficiency. The hLDA method demonstrated the highest coherence score of 0.7197, indicating robust topic consistency. However, its considerably long execution time of 103,406 s limits its practicality in real-world applications. In contrast, the NMF method yielded an acceptable coherence level (0.5803) and exhibited balanced accuracy and efficiency, with a reasonable execution time of 280 s. While the LSA method demonstrated the shortest execution time of 99 s, it yielded a lower coherence score of 0.5395 in comparison to the other methods. Consequently, when prioritizing high topic quality, the hLDA method emerges as a prominent approach. Conversely, for pragmatic applications, the NMF method proves to be a more suitable alternative.
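The trade-off described above can be put in more tangible terms with a few lines of arithmetic. The snippet uses only the coherence scores and execution times quoted in the text; the dictionary layout and variable names are illustrative.

```python
# Values copied from the text above: (coherence score, execution time in seconds).
results = {"hLDA": (0.7197, 103_406), "NMF": (0.5803, 280), "LSA": (0.5395, 99)}

best_quality = max(results, key=lambda m: results[m][0])
fastest = min(results, key=lambda m: results[m][1])

hlda_hours = results["hLDA"][1] / 3600
slowdown_vs_nmf = results["hLDA"][1] / results["NMF"][1]
coherence_gain_vs_nmf = results["hLDA"][0] - results["NMF"][0]

print(f"highest coherence: {best_quality}, fastest: {fastest}")
print(f"hLDA runtime: {hlda_hours:.1f} h, about {slowdown_vs_nmf:.0f}x slower than NMF")
print(f"coherence gain of hLDA over NMF: {coherence_gain_vs_nmf:.4f}")
```

On these figures, hLDA takes roughly 28.7 hours, about 369 times longer than NMF, for a coherence gain of about 0.14.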

4. Discussion

The three modeling methods employed in the study indicate that anti-vaccine rhetoric on YouTube is disseminated across a variety of content types and structural formats, and that these themes occupy a substantial position on the platform. The prominent themes identified in the analyzed English-language comments suggest that anti-vaccine sentiment has gone beyond an individual preference and is repeated with similar narratives across different cultural contexts. This perspective aligns with the approach of [16], who characterized vaccine opposition as a “supra-cultural” phenomenon. A study of Facebook’s capacity to disseminate messages has revealed that anti-vaccine groups exhibit a higher level of effectiveness in the propagation of their content, particularly among undecided individuals, due to their enhanced capacity to reach these targets on social media platforms [35]. This phenomenon has contributed to the global proliferation of anti-vaccine sentiments. In a similar vein [36] demonstrated that neutral individuals exhibited a stronger attitudinal alignment with anti-vaxxers, thereby facilitating the propagation of anti-vaccine rhetoric through social interaction. The prevalence of anti-vaccine rhetoric on YouTube can be attributed, at least in part, to the platform’s recommendation algorithms, which systematically direct users to content that is congruent with their existing ideological perspectives. This structural phenomenon, characterized by the creation of ideological echo chambers, has been shown to perpetuate the circulation of specific themes within these groups [37].
In contrast to conventional media outlets, social media platforms enable individuals to generate content independently and disseminate it rapidly on a global scale. The limited regulation of social media has been shown to facilitate the spread of pandemic-related conspiracy theories and misinformation, as examples from across the world illustrate. A study conducted in Turkey reported that the term “plandemic” was frequently used on Twitter, accompanied by claims that the pandemic was not real or that vaccines were developed as biological weapons [38,39]. A Canadian study demonstrated that social media and Google searches contributed to public anxiety and facilitated the dissemination of pandemic-related misinformation [40].
According to the LSA results, Topic 1, which occupies the largest portion of the graph, appears to be related to child deaths and vaccinations. A common argument among those who oppose vaccination is that childhood vaccines cause autism [41]. In all analyses conducted in this study, the terms “child,” “vaccine,” and “autism” frequently co-occurred. However, subsequent scientific studies have found no association between vaccines and autism [42,43].
A comprehensive analysis of the data reveals that the keywords “COVID-19” and “polio” are associated with themes pertaining to children and vaccination. During the pandemic, vaccines were instrumental in safeguarding schools and communities, thereby mitigating the spread and severity of the disease [44]. The impact of the pandemic on children was multifaceted, encompassing both heightened infection risk and significant psychosocial and educational repercussions. COVID-19 vaccines were emphasized as critical for preventing severe illness in this age group and for strengthening herd immunity [45]. Similarly, the polio vaccine has made a major contribution to the global decline of the disease [46]. These findings suggest that social media comments reflect not only personal beliefs but also a collective sensitivity toward protecting child health and reinforcing public immunity.
According to the hLDA analysis, Topic 1 reflects discourse centered on pharmaceutical companies, medical treatment processes, and their economic dimensions [47]. Misinformation plays an important role in fostering vaccine hesitancy, a phenomenon that extends beyond the general public to healthcare professionals as well. Similarly, [48] argues that a lack of knowledge and misperceptions can have deleterious consequences for public health.
Future research may aim to integrate the multi-model approach used in this analysis into AI-supported monitoring systems to enable the real-time detection of misinformation content on social media platforms such as YouTube. In particular, early warning systems could be developed based on specific keywords and patterns associated with anti-vaccination discourse.
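As a rough illustration of what a keyword-based early warning component might look like, the following minimal sketch flags comments that contain watch-list terms and raises an alert when the flagged share in a window of comments exceeds a multiple of a baseline rate. Everything in it is an assumption made for illustration: the watch list is merely seeded with terms that appear in the topic tables, and the names `WATCH_TERMS`, `flag_rate`, and `should_alert`, the threshold factor, and the sample comments are hypothetical rather than part of this study.

```python
import re
from collections import Counter

# Hypothetical watch list, seeded with terms that appear in Tables 2-4.
# A real system would need a curated and validated list.
WATCH_TERMS = {"poison", "scam", "conspiracy", "propaganda", "ivermectin", "lie"}


def tokenize(comment: str) -> list[str]:
    """Lowercase the comment and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", comment.lower())


def watch_hits(comment: str) -> Counter:
    """Count occurrences of each watch-list term in a comment."""
    return Counter(t for t in tokenize(comment) if t in WATCH_TERMS)


def flag_rate(comments: list[str], min_hits: int = 1) -> float:
    """Return the share of comments with at least `min_hits` watch-list terms."""
    if not comments:
        return 0.0
    flagged = sum(1 for c in comments if sum(watch_hits(c).values()) >= min_hits)
    return flagged / len(comments)


def should_alert(window: list[str], baseline_rate: float, factor: float = 2.0) -> bool:
    """Alert when the window's flag rate exceeds `factor` times the baseline."""
    return flag_rate(window) > factor * baseline_rate


# Made-up comments for demonstration only.
window = [
    "The vaccine is poison and a scam",
    "Great explanation, thanks doc",
    "Got my booster today",
]
print(flag_rate(window))
print(should_alert(window, baseline_rate=0.1))
```

A keyword match is only a signal for human review: a comment that debunks a claim will contain the same terms as one that promotes it, so a flagged comment is not necessarily misinformation.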

5. Limitations

The present study exclusively analyzed English-language comments, thereby excluding discourse from other linguistic and cultural contexts. The selection of videos based on popularity and accessibility may have introduced sampling bias. Because user verification was unavailable, the extent to which bot accounts influenced the discourse remains indeterminate. Dimensionality reduction techniques such as t-SNE improve the visual representation of the data, but they also limit its interpretability.
A significant constraint of the automated topic modeling approach employed in this study is its reliance on word co-occurrence, which identifies broad thematic clusters rather than individual user sentiment or explicit stance. While the most prominent topics appear to suggest a narrative of distrust, the associated keywords and topic structures frequently cross sentiment boundaries: the same terms may appear in skeptical, supportive, and educational discussions alike. Because of these indistinct boundaries, automated topic modeling cannot reliably distinguish skeptical comments from neutral or positive engagement. Consequently, the findings should be interpreted as reflecting thematic prevalence rather than as a definitive measure of widespread negative sentiment.
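The stance limitation can be illustrated with a small example. The sketch below assumes unigram bag-of-words features, which are the usual input for co-occurrence-based models; the exact feature pipeline of this study is not restated here, and the two comments are invented for illustration.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Return unigram counts, ignoring case, punctuation, and word order."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


# Two made-up comments with opposite stances but the same vocabulary.
skeptical = "Vaccines cause autism, the claim that they are safe is wrong"
supportive = "Vaccines are safe, the claim that they cause autism is wrong"

# Both comments map to exactly the same term counts.
same_representation = bag_of_words(skeptical) == bag_of_words(supportive)
print(same_representation)
```

Because both comments produce identical term counts, any model that sees only those counts would assign them to the same topics, even though one asserts the autism claim and the other rejects it.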

6. Conclusions and Recommendations

This study underscores a critical intersection between global macroeconomics and digital media ecology: the long-term fiscal dividends of public health are impeded by the structural mechanics of online misinformation. Preliminary economic analyses indicate that vaccination is not a budgetary drain but a high-yield strategic investment, because it safeguards household finances from catastrophic health expenditures and stabilizes national economies by preserving workforce productivity. However, the systemic resilience required to protect these economic gains is actively undermined by the rapid, unregulated dissemination of anti-vaccine rhetoric across social media platforms.
To delineate the boundaries of online vaccine hesitancy, the present study employed a multi-model computational framework incorporating NMF, LSA, and hLDA. The findings suggest that anti-vaccine sentiment has evolved from a localized, individual preference into a highly coordinated phenomenon. Driven by algorithmic recommendation engines on platforms such as YouTube, these discourses flourish within insular ideological echo chambers, systematically drawing in neutral or undecided users.
The topic models extracted both positive and negative topics. To enhance societal well-being, policymakers should make use of these topics, particularly the detrimental ones, such as the dissemination of erroneous advice, claims about the adverse effects of vaccines, and anti-vaccination perspectives. Doing so is essential for mitigating the spread of misinformation and fostering favorable public sentiment toward vaccination. Moreover, information about the benefits of vaccines for various demographic groups, including children, adolescents, and the elderly, can be disseminated through social media channels to keep the general public adequately informed.
To combat misinformation effectively, priority should be given to updating platform algorithms so that they highlight reliable sources and to supporting educational programs that enhance digital literacy. Furthermore, analyzing the video content itself can yield novel insights into how discourses form in context. Addressing misinformation on social media requires a multifaceted approach, and the multi-model analysis framework employed in this study can serve as a practical guide for these efforts.

Author Contributions

Conceptualization, Ö.T. and U.S.; methodology, U.S. and E.E.; formal analysis, U.S. and I.H.; investigation, E.E. and I.H.; data curation, U.S. and E.E.; writing—original draft preparation, Ö.T. and E.E.; writing—review and editing, I.H.; visualization, U.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Simonenko, V.B.; Abashin, V.G. Vaccination as protection against particularly dangerous infections and infectious processes that create public health emergencies of international importance. Clin. Med. (Russ. J.) 2022, 100, 5–10.
  2. Philippon, A. La saga des vaccins à ARNm. Bull. l’Académie Vétérinaire Fr. 2023, 176, 326–329.
  3. Bezbaruah, R.; Sailo, N.; Mawii, Z.; Deka, K.; Bhutia, Y.D.; Kakoti, B. Myths and facts about vaccination. In Advanced Vaccination Technologies for Infectious and Chronic Diseases; Chavda, V.P., Vora, L.K., Apostolopoulos, V., Eds.; Stacy Masucci: London, UK, 2024.
  4. Plotkin, S.L.; Plotkin, S.A. A Short History of Vaccination. In Plotkin’s Vaccines, 7th ed.; Plotkin, S.A., Orenstein, W.A., Offit, P.A., Edwards, K.M., Eds.; Elsevier: Philadelphia, PA, USA, 2018.
  5. Immunization Agenda 2030 Partners. Immunization agenda 2030: A global strategy to leave no one behind. Vaccine 2024, 42, S5–S14.
  6. Boccalini, S. Value of Vaccinations: A Fundamental Public Health Priority to Be Fully Evaluated. Vaccines 2025, 13, 479.
  7. O’Brien, K.L.; Lemango, E.; Nandy, R.; Lindstrand, A. The immunization Agenda 2030: A vision of global impact, reaching all, grounded in the realities of a changing world. Vaccine 2024, 42, S1–S4.
  8. Saman, S.; Chauhan, I.; Srivastava, N. Vaccines: An Important Tool for Infectious Disease. Recent Adv. Anti-Infect. Drug Discov. 2023, 18, 88–109.
  9. Chatterjee, S.; Zehmakan, A.N. Effective Vaccination Strategies in Network-based SIR Model. arXiv 2023, arXiv:2305.16458v1.
  10. Contreras, M.; Grazlewska, W.; Kasaija, P.D.; Fuente, J. Vaccines. In Handbook of Molecular Biotechnology; Liu, D., Ed.; CRC Press: Boca Raton, FL, USA, 2024; pp. 1–21.
  11. Kayser, V.; Ramzan, I. Vaccines and vaccination: History and emerging issues. Hum. Vaccin. Immunother. 2021, 17, 5255–5268.
  12. Dasari, S. Advances in vaccination to combat pandemic outbreaks. In Pandemic Outbreaks in the 21st Century; Viswanath, B., Ed.; Academic Press: Cambridge, MA, USA; Elsevier: Amsterdam, The Netherlands, 2021.
  13. Zhong, J.; Zhuang, Y.; Zhang, M. Impact of epidemic prevention policies on public vaccination willingness: Empirical research in China. Front. Public Health 2024, 12, 1329228.
  14. Siwakoti, S.; Evans, N.; Shapiro, J.N. Less reliable media drive interest in anti-vaccine information. Harv. Kennedy Sch. Misinformation Rev. 2023, 4, 1–15.
  15. Green, J.; Druckman, J.N.; Baum, M.A.; Ognyanova, K.; Simonson, M.D.; Perlis, R.H.; Lazer, D. Media use and vaccine resistance. PNAS Nexus 2023, 2, pgad146.
  16. Vučurović, M.; Jovanovic, V.; Krcmar, O. Anti-Vaccination movement as a supracultural phenomenon. Eur. J. Public Health 2023, 33, 558–569.
  17. Stenberg, J.; Chea, E.; Tse, J.; Thornock, B.; Stephenson, L.G. A Physician’s Approach to the Vaccine-Hesitant Patient. Ann. Public Health Rep. 2023, 7, 323–327.
  18. Amendola, A.; Canuti, M. Vaccine-Preventable Diseases. In Global Health Essentials; Raviglione, M.C., Tediosi, F., Villa, S., Casamitjana, N., Plasència, A., Eds.; Sustainable Development Goals Series; Springer: Cham, Switzerland, 2023; pp. 117–127.
  19. Raulji, J.K.; Saini, J.R. Sanskrit lemmatizer for improvisation of morphological analyzer. J. Stat. Manag. Syst. 2019, 22, 613–625.
  20. Trishala, G.; Mamatha, H.R. Implementation of Stemmer and Lemmatizer for a Low-Resource Language-Kannada. In Proceedings of International Conference on Intelligent Computing, Information and Control Systems; Springer: Singapore, 2021; pp. 345–358.
  21. Mohamed, S.A.; Mohamed, M.A. Lexicon and Rule-Based Word Lemmatization Approach for the Somali Language. arXiv 2023, arXiv:2308.01785.
  22. Toporkov, O.; Agerri, R. On the Role of Morphological Information for Contextual Lemmatization. arXiv 2023, arXiv:2302.00407v3.
  23. Teh, Y.W.; Jordan, M.I.; Beal, M.J.; Blei, D.M. Hierarchical Dirichlet Processes. J. Am. Stat. Assoc. 2006, 101, 1566–1581.
  24. Buntine, W. Understanding Hierarchical Processes. Entropy 2022, 24, 1703.
  25. Sheluhin, O.I.; Vanyushina, A.V.; Zhelnov, M.S. Use of Latent-Semantic Analysis in Preparation of Data For Identification of Anonymous Users By Digital Fingerprints. H&ES Res. 2022, 14, 36–44.
  26. Kountouri, I.; Manousakis, E.G.; Tsekrekos, A.E. Latent semantic analysis of corporate social responsibility reports (with an application to Hellenic firms). Int. J. Discl. Gov. 2019, 14, 1–19.
  27. Jorge-Botana, G.; Olmos, R.; Luzón, J.M. Bridging the theoretical gap between semantic representation models without the pressure of a ranking: Some lessons learnt from LSA. Cogn. Process. 2019, 21, 1–21.
  28. Babalola, O.; Ojokoh, B.; Boyinbode, O. Comprehensive Evaluation of LDA, NMF, and BERTopic’s Performance on News Headline Topic Modeling. J. Comput. Theo. App. 2024, 2, 268–289.
  29. Pavithra, C.B.; Savitha, J. Topic Modeling for Evolving Textual Data Using LDA, HDP, NMF, BERTOPIC, and DTM With a Focus on Research Papers. J. Technol. Inform. 2024, 5, 53–63.
  30. Dutta, P.; De, R.K. MDSR-NMF: Multiple deconstruction single reconstruction deep neural network model for non-negative matrix factorization. Netw. Comput. Neural Syst. 2023, 34, 306–342.
  31. Kobayashi, T.; Watanabe, K. End-to-End Trainable Weakly Non-Negative Factorization. In 2023 IEEE International Conference on Image Processing (ICIP); IEEE: Piscataway, NJ, USA, 2023; pp. 490–494.
  32. Brie, D.; Gillis, N.; Moussaoui, S. Non-negative Matrix Factorization. In Source Separation in Physical-Chemical Sensing; Jutten, C., Duarte, L.T., Moussaoui, S., Eds.; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2023.
  33. Mimno, D.; Wallach, H.M.; Talley, E.; Leenders, M.; McCallum, A. Optimizing semantic coherence in topic models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP ’11); Association for Computational Linguistics: Stroudsburg, PA, USA, 2011; pp. 262–272.
  34. Röder, M.; Both, A.; Hinneburg, A. Exploring the Space of Topic Coherence Measures. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining (WSDM ’15); Association for Computing Machinery: New York, NY, USA, 2015; pp. 399–408.
  35. Johnson, N.F.; Velásquez, N.; Restrepo, N.J.; Leahy, R.; Gabriel, N.; El Oud, S.; Lupu, Y. The online competition between pro-and anti-vaccination views. Nature 2020, 582, 230–233.
  36. Carpentras, D.; Lüders, A.; Quayle, M. Mapping the global opinion space to explain anti-vaccine attraction. Sci. Rep. 2022, 12, 6188.
  37. Cinelli, M.; De Francisci Morales, G.; Galeazzi, A.; Quattrociocchi, W.; Starnini, M. The echo chamber effect on social media. Proc. Natl. Acad. Sci. USA 2021, 118, e2023301118.
  38. Puri, N.; Coomes, E.A.; Haghbayan, H.; Gunaratne, K. Social media and vaccine hesitancy: New updates for the era of COVID-19 and globalized infectious diseases. Hum. Vaccin. Immunother. 2020, 16, 2586–2593.
  39. Sütçü, K.; Tekerek, B.; Özler, G. Aşı karışıtı Twitter paylaşımlarının metin madenciliği ve içerik analizi yöntemiyle incelenmesi. Hacet. Sağlık İdaresi Derg. 2022, 25, 827–838.
  40. Yousefinaghani, S.D. Prediction of COVID-19 waves using social media and Google search: A case study of the US and Canada. Front. Public Health 2021, 9, 656635.
  41. Tellez, E. The Link Between Vaccination and Autism. Virginia Community College System. 2018, pp. 1–8. Available online: https://commons.vccs.edu/cgi/viewcontent.cgi?article=1026&context=student_writing (accessed on 30 May 2026).
  42. DeStefano, F.; Shimabukuro, T.T. The MMR Vaccine and Autism. Annu. Rev. Virol. 2019, 6, 585–600.
  43. Matthew, Z.; Daniel, A.; Salmon, N.; Neal, A.; Halsey, W. Do Vaccines Cause Autism? In The Clinician’s Vaccine Safety Resource Guide; Springer: Cham, Switzerland, 2018; pp. 197–204.
  44. Agrawal, S.; Dayama, S.; Galthora, A. Vaccinate every child against COVID-19: A scientific or socioeconomic need? J. Fam. Med. Prim. Care 2022, 11, 1658–1663.
  45. Weber, D.J.; Zimmerman, K.O.; Tartof, S.Y.; McLaughlin, J.M.; Pather, S. Risk of COVID-19 in Children throughout the Pandemic and the Role of Vaccination: A Narrative Review. Vaccines 2024, 12, 989.
  46. Silva, B.C.; Pinto, F.F.; Araujo, I.; Barros, G.B.; Silvério, A.S. The Current Relevance of Vaccination For Covid-19, Influenza, Poliomyelitis, Measles And Smallpox. Int. J. Dev. Res. 2023, 13, 62935–62939.
  47. Subramaniam, D.S.; Al Juboori, R.; Bianco, A.; Subramaniam, D.P. Knowledge and behavioral beliefs related to vaccination hesitancy among healthcare workers. Front. Public Health 2024, 12, 1518112.
  48. Youssef, D.; Abou-Abass, L.; Issa, O.; Youssef, J.; Hassan, H. Unlocking the keys to COVID-19 vaccine acceptance: Insights from healthcare workers and the general population. Discov. Soc. Sci. Health 2024, 4, 79.
Figure 1. Visualization of topics derived from the Hierarchical Dirichlet Process using Word Cloud.
Figure 2. Intertopic distance map of hLDA topic modeling.
Figure 3. Latent semantic analysis results with Word Cloud table.
Figure 4. Intertopic distance map of LSA topic modeling.
Figure 5. Non-negative matrix factorization method results using Word Cloud table.
Figure 6. Intertopic distance map of NMF topic modeling.
Figure 7. Representation of the obtained topics through the intersection set.
Table 1. Number of videos and comments analyzed by keywords.
Keywords | Number of Comments | Comment % | Number of Videos | Video %
Covid Vaccine | 27,540 | 29.28 | 45 | 20.93
Vaccine | 23,919 | 25.43 | 48 | 22.33
Zero Dose Vaccine | 19,227 | 20.44 | 28 | 13.02
Vaccination | 11,805 | 12.55 | 27 | 12.56
mRNA | 4313 | 4.58 | 5 | 2.33
Monkeypox | 2778 | 2.95 | 13 | 6.05
HIV Vaccine | 2118 | 2.25 | 13 | 6.05
Other Vaccines (BCG, Chickenpox, Flu, Polio, AIDS, Ebola, Hepatitis B) | 2372 | 2.52 | 36 | 16.74
Total | 94,072 | 100 | 215 | 100
Table 2. Results of the analysis conducted with the Hierarchical Dirichlet Process method.
Topic 1 | company (0.02), cure (0.02), wonder (0.02), money (0.02), court (0.02), smart (0.02), medication (0.02), help (0.01), pay (0.01), medicine (0.01), hurt (0.01), cured (0.01), fake (0.01), cancer (0.01), life (0.01)
Topic 2 | vaccine (0.03), child (0.02), vaccinated (0.02), autism (0.01), kid (0.01), person (0.01), risk (0.01), idiot (0.01), parent (0.01), actually (0.01), wrong (0.01), die (0.01), stupid (0.01), mean (0.01), unvaccinated (0.01)
Topic 3 | virus (0.05), body (0.03), immune (0.02), cell (0.02), mrna (0.02), protein (0.02), vaccine (0.01), disease (0.01), natural (0.01), immunity (0.01), dna (0.01), kill (0.01), spread (0.01), fight (0.01), response (0.01)
Topic 4 | covid (0.03), year (0.02), death (0.02), effect (0.01), sick (0.01), flu (0.01), vaccine (0.01), vaccination (0.01), vaccinated (0.01), life (0.01), shot (0.01), polio (0.01), hospital (0.01), booster (0.01), day (0.01)
Topic 5 | anti (0.07), vax (0.05), vaxxers (0.02), nice (0.01), video (0.01), interesting (0.01), explanation (0.01), change (0.01), afraid (0.01), experiment (0.01), mind (0.01), control (0.01), dangerous (0.01)
Topic 6 | covid (0.07), jab (0.02), heart (0.02), amp (0.02), conspiracy (0.02), died (0.01), comment (0.01), dead (0.01), lie (0.01), pfizer (0.01), attack (0.01), stupidity (0.01), brain (0.01), lost (0.01), theory (0.01)
Topic 7 | antivaxxers (0.03), propaganda (0.02), pox (0.01), smallpox (0.01), news (0.01), monkey (0.01), earth (0.01), history (0.01), argue (0.01), woman (0.01), sheep (0.01), flat (0.01), awesome (0.01), seriously (0.01), daughter (0.01)
Topic 8 | government (0.02), trump (0.02), medical (0.02), trust (0.02), science (0.02), doctor (0.01), public (0.01), pharma (0.01), research (0.01), medium (0.01), drug (0.01), health (0.01)
Topic 9 | blood (0.02), sad (0.01), literally (0.01), clot (0.01), poison (0.01), humanity (0.01), vaxxed (0.01), hilarious (0.01), dumb (0.01), sound (0.01), lab (0.01), herb (0.01), pure (0.01)
Topic 10 | good (0.02), mask (0.01), safe (0.01), feel (0.01), republican (0.01), job (0.01), hard (0.01), mandate (0.01), truth (0.01), sense (0.01), effective (0.01), listen (0.01), guy (0.01), hate (0.01)
Table 3. Results of the latent semantic analysis method.
Topic 1 | long, died, jab, life, disease, vaccination, effect, kid, body, death, good, child, virus, vaccinated, covid
Topic 2 | failure, stage, ivermectin, strain, bed, fatigue, update, version, fruit, positive, beast, scam, scurvy, citrus, covid
Topic 3 | hope, feel, doc, nice, stuff, informative, girl, idea, science, news, animation, information, explanation, covid, good
Topic 4 | vaccination, die, anti, risk, school, measles, covid, autism, unvaccinated, vaccinate, parent, good, kid, child, vaccinated
Topic 5 | antibody, infected, person, hiv, spread, protein, sick, body, immune, cell, corona, unvaccinated, good, virus, vaccinated
Topic 6 | hiv, disease, measles, autism, protein, body, immune, corona, cell, vaccinate, good, covid, parent, virus, child
Topic 7 | stupid, immune, cell, covid, vaccinate, parent, corona, good, body, child, vaxxers, kid, virus, vax, anti
Topic 8 | measles, feel, school, mrna, needle, cell, jab, poor, autism, covid, vaccinate, parent, virus, poison, kid
Topic 9 | covid, sick, blood, kid, amazing, dna, produce, vaccination, disease, vaccinated, immune, protein, mrna, cell, body
Topic 10 | effective, believe, company, medical, covid, virus, safe, pharma, money, poison, lie, science, body, government, trust
Table 4. Topics obtained from the non-negative matrix factorization method.
Topic 1 | infection, positive, against, mrna, tested, rate, month, die, symptom, flu, vax, pandemic, booster, death, covid
Topic 2 | heart, sick, bad, risk, problem, believe, health, flu, medical, safe, death, disease, vaccination, jab, effect
Topic 3 | doc, harm, nice, stuff, hope, informative, pretty, bad, girl, idea, news, information, explanation, job, good
Topic 4 | cured, dead, natural, herpes, herbal, protein, fight, cell, kill, spread, immune, cure, hiv, corona, virus
Topic 5 | infected, population, forced, fine, protected, measles, school, spread, die, risk, protect, against, sick, unvaccinated, vaccinated
Topic 6 | baby, protect, polio, school, adult, mother, vaccinating, autistic, risk, disease, measles, autism, vaccinate, parent, child
Topic 7 | wrong, dumb, vac, against, dislike, idiot, mom, mandate, die, movement, stupid, vaxx, vaxxers, vax, anti
Topic 8 | mom, care, polio, healthy, leave, needle, die, measles, poor, school, autism, poison, vaccinate, parent, kid
Topic 9 | cancer, part, natural, attack, poison, fight, antibody, blood, produce, dna, immune, protein, mrna, cell, body
Topic 10 | paid, evil, scientist, poison, truth, company, medical, speed, believe, money, pharma, lie, government, science, trust
Table 5. Comparison of topic modeling methods.
Method | Number of Topics | Coherence Score (c_v) | Execution Time (s)
LSA | 10 | 0.5395 | 99
hLDA | 10 | 0.7197 | 103,406
NMF | 10 | 0.5803 | 280
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Sert, U.; Ersoy, E.; Tosun, Ö.; Hatıpoğlu, I. Vaccine Perception on Digital Platforms: Topic Modeling of YouTube Comments. Computers 2026, 15, 360. https://doi.org/10.3390/computers15060360