Article

Recoding Reality: A Case Study of YouTube Reactions to Generative AI Videos

by Levent Çalli 1,* and Büşra Alma Çalli 2
1 Department of Data Science and Analytics, Faculty of Computer and Information Sciences, Sakarya University, Sakarya 54050, Türkiye
2 Department of Management Information Systems, Faculty of Business Administration, Sakarya University, Sakarya 54050, Türkiye
* Author to whom correspondence should be addressed.
Systems 2025, 13(10), 925; https://doi.org/10.3390/systems13100925
Submission received: 20 September 2025 / Revised: 15 October 2025 / Accepted: 20 October 2025 / Published: 21 October 2025

Abstract

The mainstream launch of generative AI video platforms represents a major change to the socio-technical system of digital media, raising critical questions about public perception and societal impact. While research has explored isolated technical or ethical facets, a holistic understanding of the user experience of AI-generated videos—as an interrelated set of perceptions, emotions, and behaviors—remains underdeveloped. This study addresses this gap by conceptualizing public discourse as a complex system of interconnected themes. We apply a mixed-methods approach that combines quantitative LDA topic modeling with qualitative interpretation to analyze 11,418 YouTube comments reacting to AI-generated videos. The study’s primary contribution is the development of a novel, three-tiered framework that models user experience. This framework organizes 15 empirically derived topics into three interdependent layers: (1) Socio-Technical Systems and Platforms (the enabling infrastructure), (2) AI-Generated Content and Esthetics (the direct user-artifact interaction), and (3) Societal and Ethical Implications (the emergent macro-level consequences). Interpreting this systemic structure through the lens of the ABC model of attitudes, our analysis reveals the distinct Affective (e.g., the “uncanny valley”), Behavioral (e.g., memetic participation), and Cognitive (e.g., epistemic anxiety) dimensions that constitute the major elements of user experience. This empirically grounded model provides a holistic map of public discourse, offering actionable insights for managing the complex interplay between technological innovation and societal adaptation within this evolving digital system.

1. Introduction

By the mid-2020s, AI-generated video has moved decisively from a research curiosity to a mainstream phenomenon. Advances in generative models and user-friendly software have dramatically lowered the barrier to creating realistic synthetic video and audio [1]. As one industry review observes, such content “has become increasingly widespread”—“seamlessly woven into our cultural fabric”, reshaping communication and democratizing access to media production [1,2]. In practice, this means that millions of short “deepfake” clips (with faceswaps, voice clones, virtual avatars, etc.) now circulate on platforms like TikTok, YouTube and Instagram, often produced by commercial AI tools. The resulting ubiquity of synthetic media on social feeds has amplified the potential reach of any single video, making generative AI a routine part of everyday content creation and consumption.
These technological shifts have not gone unnoticed by the public. Large-scale surveys report that awareness of AI-manipulated video is now nearly ubiquitous and that citizens express rising concerns about its implications for privacy, journalism, and democracy. For example, a national survey in late 2024 found that 58% of U.S. adults worried about AI’s impact on politics and 53% were concerned about its effect on news media [3]. Notably, 41% of respondents said AI does more harm than good in protecting personal information. Earlier polling by Pew Research similarly showed that a clear majority of Americans perceive altered videos as confusing and misleading, and favor restrictions on deceptive media (e.g., ~77% supported limiting misleading “deepfake” videos) [4]. In short, public opinion data indicate widespread recognition of synthetic video and anxiety about its potential misuse—especially in elections, fraud schemes, or privacy violations. In response, multistakeholder efforts now push for practical solutions—provenance metadata, watermarking, and user-facing verification—to help people judge authenticity [2,5].
Meanwhile, computational analyses of social media show that everyday reactions to AI-generated video are complex and varied. One recent study of Reddit discussions used topic modeling to identify two broad themes: creative or entertainment uses of deepfakes versus legal, ethical and personal-security concerns [6]. Xu et al. (2025) [6] found that sentiment analysis of these posts revealed a near-even split: approximately 47.0% conveyed positive or curious attitudes—such as fascination with novel voice-synthesis effects—while 36.8% reflected negative sentiments, including fear, distrust, or demands for stricter regulation. Observers note that many users engage with AI videos as satire or memes. Indeed, digital creators have embraced “deepfake” techniques to produce humorous parodies of celebrities and politicians, taking advantage of social media’s remix culture; as one commentary puts it, “comedic sensibilities can flourish within an online environment that welcomes remixing and sampling” [7]. At the same time, the blending of real and fake content provokes an uncanny sense of uncertainty. Users frequently ask whether a video is genuine or synthetic, and platforms are beginning to respond: major social networks now encourage explicit AI-provenance labels or metadata to help audiences assess authenticity [1].
Previous studies, such as Xu et al.’s (2025) [6] analysis of Reddit, have shown that public discourse on AI videos is polarized between creative uses and ethical concerns. However, Reddit’s “meta-discussion” format tends to attract a more tech-savvy audience. This study therefore shifts its focus to YouTube to capture the immediate, organic reactions of a broader public. Because YouTube is the native platform for video, reactions occur directly under the content itself, allowing for a more naturalistic analysis. Understanding these immediate reactions is critical for contextualizing emerging technology policies.
At the international level, organizations from the UN to technical standards bodies are also mobilizing. In July 2025 the ITU (a UN agency) released a report urging social-media platforms to deploy advanced AI detectors and digital verification tools for images and video [8]. Likewise, the ITU’s newly formed AI and Multimedia Authenticity Standards Collaboration—involving ISO/IEC, industry leaders (Adobe, Microsoft, Shutterstock, etc.), and organizations like the Content Authenticity Initiative (C2PA)—stresses that as AI media proliferates, “digital content…must also be traceable, trustworthy, and ethically produced”, necessitating global coordination on standards [2].
Taken together, these technical, social, and legal developments underscore that AI-generated video is no longer a marginal phenomenon. The convergence of cutting-edge generative tools, high public salience, and urgent regulatory efforts means that the field now demands a multidisciplinary response. Researchers in computer science, HCI, communication studies, and related fields are therefore investigating AI video from different perspectives—developing detection methods, studying user perceptions, and advising policy—to harness its creative potentials while guarding against harms [9]. Large-scale public surveys and expert polling report rising awareness and concern about manipulated media—including anxiety about privacy, trust in news, and political/financial harms—underscoring the social salience of synthetic clips [9,10]. Computational studies of in-platform conversations show that user experience and perceptions are multidimensional—encompassing esthetic appreciation and humor as well as epistemic worry and calls for regulation—and that topic-level patterns can be recovered at scale using topic modeling and sentiment analysis [6].
This article fills an important gap by using a large-scale, mixed qualitative–quantitative analysis of public discourse about AI-generated videos. Rather than reducing user experience to a single score, the study conceptualizes everyday responses as multidimensional discourse—encompassing evaluative judgments (trust or distrust), esthetic reactions (amusement or uncanny discomfort), practice-oriented behaviors (remixing and memetic use), and governance concerns (provenance and platform policy)—and uses these patterns to build a clear, reproducible taxonomy that can guide both research and policy.
Empirical and theoretical work motivates this multi-faceted framing: consumer research shows AI encounters take varied experiential forms rather than a single attitudinal dimension [11], and experimental work on AI-generated artifacts highlights distinct esthetic and emotional responses that matter for interpretation [12]. At the same time, large-scale studies of misinformation and information diffusion demonstrate that false or misleading content spreads differently from true content and therefore that epistemic concerns are central to public reaction [13,14]. Research on why content is shared underlines the role of emotional and social drivers in virality—an important mechanism when users encounter striking synthetic media [15].
Finally, legal and policy scholarship has foregrounded provenance, platform responsibility, and the distinctive regulatory challenges posed by deepfakes and synthetic media [16], while methodological critiques emphasize that lab and survey methods can miss the emergent conversational dynamics of in-platform discourse and thus that ecological approaches to naturalistic data are necessary [17].
To address the specific gap in the literature, this study proposes a multi-faceted framework and formulates the following key research questions to systematically analyze the user experience and perceptions of AI-generated videos:
  • (RQ1) What do ordinary users say they think about AI-generated videos on major public platforms?
  • (RQ2) Can these natural responses be grouped into a small set of recurring thematic categories?
  • (RQ3) Which themes dominate everyday discourse and which themes are comparatively rare?
Consequently, guided by these research questions, this study addresses a clear literature gap by systematically mapping how users respond to AI-generated videos. It contributes to the existing body of knowledge by developing a reproducible taxonomy and examining the key elements of user experience. In addition, the study presents an initial exploratory analysis that lays the groundwork for future causal research and policy-oriented interventions. Ultimately, the findings demonstrate that public discourse encompasses both enthusiasm and profound concerns, underscoring the need for closer scholarly and regulatory attention.

2. Literature Review

The rapid proliferation of generative AI has catalyzed a significant body of research aimed at understanding its technical capabilities, societal implications, and public reception. While scholarly inquiry is expanding, a notable empirical gap persists in the large-scale, naturalistic analysis of spontaneous public discourse surrounding AI-generated video content. Existing literature provides valuable foundational work but often focuses on specific platforms, commercial contexts, or relies on experimental methods, leaving the broader landscape of organic user reactions on video-native platforms underexplored. This review synthesizes key studies, as outlined in Table 1, to identify the precise gaps this research aims to address.
Initial efforts to map public discourse have successfully utilized computational methods to analyze text-based social media platforms. For instance, the work of [6] on Reddit provides a pioneering topic model of deepfake discussions, identifying a fundamental bifurcation in public sentiment between creative applications (Culture and Entertainment) and normative concerns (Legal and Ethical Impacts). Their analysis reveals a near-even split between positive and negative sentiment, highlighting the polarized nature of the conversation. However, this study’s focus on Reddit—a platform whose user base is often more technically inclined—limits the generalizability of its findings to the broader public. Furthermore, its reliance on a binary positive-negative sentiment classification, while useful, may obscure the more nuanced affective, cognitive, and behavioral reactions that our study seeks to uncover. This points to a need for research situated on a more general, video-native platform like YouTube, employing a more granular analytical framework.
Other research has investigated user reactions within specific, often commercial, domains. Studies by Seo et al. (2025) [18] on AI-generated tourism videos and Belanche et al. (2025) [19] on AI-generated images in service contexts demonstrate that perceptions of authenticity, trust, and realism are critical moderators of consumer attitudes and behavioral intentions. These studies compellingly show that context matters, with factors like hedonic value and consumer involvement shaping persuasive outcomes. The limitation of this research stream, however, is its reliance on either domain-specific applications (e.g., tourism, marketing) or controlled experimental stimuli. While valuable for isolating variables, this approach does not capture the full spectrum of unsolicited, organic commentary that emerges in naturalistic digital environments, where user engagement is not exclusively driven by commercial intent. A clear gap remains in understanding the spontaneous, multifaceted discourse that occurs outside of these structured contexts.
From a methodological perspective, the tools for large-scale analysis of social media corpora are becoming increasingly sophisticated. Systematic reviews like that of Gupta et al. (2024) [20] have used topic modeling to map the scholarly literature itself, providing a valuable macro-level taxonomy of academic research clusters but not a direct analysis of public discourse. Concurrently, methodological studies such as that by Sun et al. (2025) [21] demonstrate the feasibility of applying topic modeling techniques like LDA to short-text comment corpora. This work offers practical guidance on best practices, such as combining automated topic discovery with LLM-assisted labeling and human validation. Smaller-scale qualitative analyses, such as that by Kaya (2025) [22], further confirm the richness of YouTube comments, revealing a bifurcation between amusement and fear.
Table 1. Literature Review.

[6]
Research focus: Mapping public discourse and sentiment polarity regarding deepfakes on Reddit.
Methodology and data: LDA topic modeling and sentiment analysis; ~17,720 Reddit posts and comments.
Key findings: Identified a clear thematic split between creative/entertainment uses and ethical/legal concerns; confirmed a highly polarized public sentiment (~47% positive vs. ~37% negative).

[18]
Research focus: Investigate how AI-generated tourism videos are perceived and whether perceptions affect tourist intentions.
Methodology and data: Mixed methods using experimental stimuli and survey responses; controlled datasets of user reactions to specific prompts.
Key findings: Highlighted the crucial role of perceived authenticity, trust, and context (e.g., hedonic vs. utilitarian) in shaping user attitudes; demonstrated that transparency is a key moderator of persuasion.

[19]
Research focus: Compare user reactions to AI-generated vs. human images in service contexts; measure attitudes and behavioral intent.
Methodology and data: Topic modeling to extract emergent themes from open-ended responses (LDA combined with confirmatory tests); experimental stimuli plus open-ended user responses.
Key findings: AI-generated visuals can be as persuasive as real images in some hedonic, high-involvement contexts, but transparency and perceived authenticity are decisive moderators; topic analysis surfaces concerns about trust, esthetics, and perceived value.

[21]
Research focus: Methodological demonstration of a pipeline for extracting and labeling topics from YouTube comments using LDA and LLMs.
Methodology and data: LDA topic modeling combined with LLM-assisted labeling for interpretation; a YouTube comment dataset on AI-related topics.
Key findings: Provided a methodological proof-of-concept showing that coherent topics can be extracted at scale; LDA uncovers coherent word clusters; human validation remains important for short, noisy comments.

[22]
Research focus: Small-scale qualitative analysis of user comments on specific deepfake videos on YouTube.
Methodology and data: Manual content analysis and descriptive sentiment scoring; a limited sample of YouTube comments from selected videos.
Key findings: Identified a core tension between user amusement/admiration and fear/anxiety regarding deepfakes; positioned YouTube as a rich site for analyzing organic user reactions.

[20]
Research focus: Broad mapping of academic literature on generative AI.
Methodology and data: Large-scale topic modeling (BERTopic) of scholarly articles; a bibliographic corpus of ~1319 academic records from Scopus.
Key findings: Provides a macro-level taxonomy of research clusters (e.g., images, text, ethics, detection); while LDA remains common, transformer-based contextual topic methods and human-in-the-loop labeling are increasingly used to handle short social texts.
Collectively, the existing literature establishes a clear trajectory: computational methods can effectively map public discourse, key psychological constructs like trust and authenticity are central to user perception, and the methodologies for analyzing this data are maturing. However, a comprehensive study that integrates these threads is still missing. There remains a significant need for a large-scale, empirically grounded framework that systematically organizes the spontaneous public reactions to AI-generated video on a major video-native platform. While previous work has identified broad themes or analyzed specific contexts, this study fills a crucial gap by providing a reproducible, multidimensional taxonomy that maps the complex interplay of esthetic, socio-technical, and ethical concerns as they are articulated by a global audience in a naturalistic setting.

3. Conceptual Model

Our conceptual model is anchored in the ABC framework of attitudes (Affective, Behavioral, and Cognitive) proposed by Solomon et al. (2013) [23]. This model posits that viewers engage in a multidimensional appraisal of AI-generated content, eliciting affective responses (e.g., unease, amusement), prompting cognitive judgments about authenticity and trust, and shaping behavioral intentions, such as sharing or advocacy. While these internal appraisals are not directly observable, they manifest in public discourse. The study therefore treats the thematic structures within YouTube comments as empirical indicators of these underlying processes. This framework provides a structured lens to systematically organize these public reactions, serving as the bridge to our three core analytical themes: AI-Generated Content and Esthetics, Socio-Technical Systems and Platforms, and Societal and Ethical Implications.
Building upon this ABC framework, we propose an intermediate hierarchical classification that organizes the various themes of public discourse into three higher-order, policy-relevant categories.

3.1. ABC Framework of Attitudes

Users’ acceptance and use of AI-generated videos are shaped by attitudinal components—affective (emotions and feelings), behavioral (actions and intentions), and cognitive (beliefs and judgments)—consistent with the ABC model of attitudes [23]. These components are themselves influenced by prior experience with the technology, prevailing social norms, and media representations, such that systematic examination of affective, behavioral and cognitive responses is necessary to anticipate consumers’ feelings, emotional reactions, and interactions with AI-generated video technologies [23].

3.1.1. Affective Responses to AI-Generated Videos

AI-generated videos evoke both positive and negative emotional reactions. Public commentary collected by [24] emphasizes reactions to the eerily realistic appearance of synthetic videos, concerns about rapid and unexpected technological development, demands for regulation and public education, problems related to watermarking and provenance, and fears regarding misuse. While AI holds evident promise, scholarship has highlighted an under-researched “dark side” encompassing adverse emotional and psychosocial outcomes [25,26]. Prior research reports a spectrum of negative affective outcomes across AI application contexts, including perceptions of identity threat [27], user dissatisfaction [28], demotivation [29], negative attitudes toward AI firms [26], technical exhaustion [30], anger and confusion [31], diminished interpersonal interaction and creativity [29], reduced productivity [32], and difficulties in comprehension [11]. Empirical evidence further indicates that scenarios in which AI is framed as “defeating” humans intensify negative emotional responses such as fear, anger, contempt, and despair [26]. At the same time, experimental and observational work on AI-produced art suggests that algorithmic artifacts can elicit emotional responses and impressions of intentionality, although human-made artworks typically produce higher appreciation, meaningfulness, and positive affect [12]. Foundational work on perceptual responses to near-human representations—the “uncanny valley” hypothesis—remains relevant for interpreting affective reactions to high-fidelity synthetic faces and voices [33,34].

3.1.2. Behavioral Consequences and Intentions

Behavioural responses to AI-generated videos are strongly conditioned by perceptions of authenticity and the perceived intentions of content creators. Consumers may adopt sceptical or defensive stances as synthetic video prevalence increases, leading to reduced engagement and lower willingness to support or share such content [35]. Perceived manipulative intent and epistemic scepticism diminish the persuasive power of emotionally engaging content and can depress behavioural outcomes such as trust, sharing, and purchase intent [36,37]. Domain-specific studies corroborate these dynamics: analyses of AI-generated tourism videos report that authenticity and trustworthiness shape attitudinal responses and that destination attractiveness can mitigate negative attitudes to some extent, whereas poor video quality undermines conversion of favourable attitudes into behavioural intentions [18]. In the marketing context, AI-generated sponsored vlogs show that video quality and content usefulness contribute positively to information adoption, yet creator reputation functions as the strongest trust bridge influencing consumer actions [34]. Research on advertising suggests that human–AI collaborative portrayals (human alongside AI characters) generate more favourable audience attitudes and lower AI-related anxiety than advertisements featuring only AI characters [38]. At the organisational level, adoption and managerial behavioural intentions depend on enabling conditions and institutional support; managers’ personal concerns regarding ethical risks, job security, and loss of control can impede diffusion unless adequately addressed [39,40]. Psychological needs such as relatedness and autonomy further modulate positive behavioural orientations toward AI in creative domains [41]. More broadly, ethical perceptions (e.g., concerns about misinformation, privacy, accountability, and control) elevate perceived risk, erode trust, and thereby weaken adoption intentions for AI-generated content tools [42].

3.1.3. Cognitive Evaluations: Authenticity, Trust and Epistemic Concerns

Cognitive judgements concerning the authenticity, provenance and trustworthiness of AI-generated videos critically shape user responses. The increasing realism of generative models raises epistemic anxieties about evidence and veracity, prompting calls for provenance, watermarking, and regulatory measures [24,37]. Studies of perceived authenticity demonstrate context dependency: in political contexts, authenticity is often assessed against historical factuality, whereas in fictional or speculative contexts viewers inquire into the extent of technological fabrication and the ontological status of depicted characters [43]. Empirical work further indicates that ethical frameworks, robust data security, and AI literacy strengthen consumer trust in AI-generated content and increase persuasive efficacy, whereas algorithmic opacity, bias, and accountability deficits undermine trust [34,42]. Research in e-commerce contexts identifies service quality, security, and design aesthetics as central to user experience with AI-generated product representations, underscoring the cognitive role of perceived functionality and interface cues in shaping credibility assessments [44]. Collectively, these findings highlight that cognitive appraisals of veracity and institutional safeguards are decisive for acceptance and downstream behavioural outcomes.
Consequently, understanding user experience and acceptance of AI-generated videos requires an integrative perspective that examines the interplay of affective, behavioral, and cognitive dimensions. As summarized in Figure 1, these dimensions consist of: affective responses, which capture immediate emotional reactions; behavioral outcomes, which reflect subsequent actions and intentions; and cognitive evaluations, which encompass judgments on authenticity, trust, and ethics. Collectively, these components constitute a comprehensive framework that not only explains variations in user perception but also illuminates the mechanisms by which attitudes are shaped, reinforced, or challenged. This attitudinal lens is therefore critical for anticipating patterns of engagement with emerging AI video technologies and for informing strategies that promote responsible design, regulation, and adoption.
Existing literature establishes that affective reactions, behavioral intentions and cognitive appraisals interactively determine consumer responses to AI-generated video content. Nevertheless, there remains a relative paucity of empirical studies focused explicitly on user experience with AI-generated videos across heterogeneous domains; extant work tends to concentrate on adjacent contexts (e.g., AI art, sponsored vlogs, tourism videos, e-commerce screenshots) or on broader theoretical treatments of AI’s societal impacts [12,18,34,42]. Moreover, while ethical and governance issues have been increasingly foregrounded, systematic investigations linking perceptual cues (e.g., fidelity, watermarking, creator reputation) to attitudinal subcomponents and concrete behavioral outcomes remain limited. These limitations indicate the necessity of domain-sensitive research, employing a quantitative topic modeling approach followed by qualitative interpretation, that integrates affective, behavioral and cognitive measures to explicate acceptance processes for AI-generated video artifacts. To address this gap, this study operationalizes the very methodology the literature calls for: a large-scale topic modeling of naturalistic user comments, followed by a theory-driven qualitative interpretation. This approach provides a powerful, bottom-up framework for discovering the latent thematic patterns in user discourse before mapping them onto established theoretical constructs.
Operationalizing an intermediate hierarchical classification from topic-model outputs is valuable because probabilistic topic models recover repeatable, bottom-up clusters of co-occurring terms that summarize the dominant semantic patterns in large comment corpora, making them a natural empirical substrate for mapping onto theory-driven categories [45,46]. When topic discovery is combined with human interpretability checks and structural extensions that leverage document metadata, the resulting topics become interpretable, policy-relevant indicators rather than purely statistical artifacts [47,48]. Methodological reviews further show that a mixed-methods workflow (automated topic extraction → manual labeling → external validation) closes an important gap between large-scale text mining and robust social-science inference, improving construct validity and reproducibility [49,50]. Finally, computational social-science applications demonstrate that topic-based signals can act as early, scalable markers of emergent public concerns—helping researchers and policymakers prioritize issues, design targeted interventions, and contribute new empirical evidence to a growing academic literature [48,51].
The proposed structure serves as an interpretive bridge between the emergent thematic clusters identified in the data and the theory-driven channels of the ABC model. Concretely, these categories are:

3.2. Socio-Technical Systems and Platforms

This category captures discussions focused on the AI technologies themselves—the platforms, tools, and competitive dynamics that shape their development and accessibility. The theoretical basis for this category is socio-technical theory, which holds that technology and its social context co-constitute each other [52,53]. It encompasses discourse related to platform affordances such as recommendation algorithms, ranking dynamics, and moderation practices [54,55]. Conceptually, this theme captures expressions where users combine cognitive judgments about platform governance with behavioral intentions (e.g., to report or share content) [56]. Because algorithmic systems materially alter user experience and belief formation [14], we posit that discourse within this category can serve as a direct indicator of platform-level issues and point to testable policy levers [13].

3.3. AI-Generated Content and Esthetics

This category addresses the creative outputs and style of generative AI, and is informed by theories of human–AI creativity and aesthetics [57]. It is intended to index the perceptual and stylistic cues in the media—such as visual fidelity, lip-sync errors, and other "uncanny" artifacts—that are known to trigger immediate affective responses [58,59]. The behavioral consequences of such emotional engagement are well-established; emotionally charged content tends to spread farther and faster online [15,60].
A key premise of this category is the challenge of human perception; experimental evidence shows that people often fail to reliably detect synthetic media [61,62]. This perceptual limitation suggests that spontaneous user reactions represent a uniquely valuable source of data. Since conscious detection is unreliable, unfiltered aesthetic and affective critiques can offer a practical proxy for emotional-cognitive processing. We therefore propose that the analysis of such reactions is crucial for predicting real-world circulation patterns and for understanding the cognitive mechanisms that make audiences susceptible to misinformation, even when it is implausible or fact-checked [63].

3.4. Societal and Ethical Implications

This category is framed to capture broad, normative concerns regarding the societal consequences of generative AI, including issues of privacy, consent, labor impacts, and accountability [16,64]. Its theoretical foundation lies in established AI ethics and social impact frameworks. Themes within this category are expected to operate through cognitive and normative channels, wherein individuals evaluate harms, weigh rights, and propose institutional remedies [65]. These concerns closely mirror established principles in the AI ethics literature, such as the FATE (Fairness, Accountability, Transparency, Ethics) framework [66]. Consequently, the conceptual utility of this category rests on the premise that the prevalence of such themes can serve as a real-time barometer of public policy preferences and ethical priorities [67].
The development of this three-tiered framework is both empirically derived and theoretically grounded. Unlike simpler binary classifications (e.g., positive vs. negative sentiment) that would obscure the multifaceted nature of audience reactions, our model provides a more comprehensive and explanatory structure. It is empirically driven because these three macro-categories emerged as the most logical way to organize the 15 distinct topics identified through our bottom-up LDA analysis. It is theoretically meaningful because it captures the full spectrum of engagement, scaling from the micro-level of direct interaction with the media artifact (Esthetics), to the meso-level of the infrastructure delivering it (Platforms), and culminating in the macro-level of its perceived impact on society (Ethics). This hierarchical structure thus offers a robust and nuanced lens for interpreting public discourse that is superior to flatter, less structured analytical approaches.
The conceptual model presented in this study organizes public discourse into a multidimensional framework. It posits that user reactions are not monolithic but are structured across three primary thematic domains. Furthermore, within each domain, responses can be understood through the distinct psychological channels of the ABC model of attitudes: Affective, Behavioral, and Cognitive. Table 2 provides a conceptual map of this integrated framework. It systematically organizes public discourse by mapping our three thematic categories—AI-Generated Content and Esthetics, Socio-Technical Systems and Platforms, and Societal and Ethical Implications—against the affective, behavioral, and cognitive dimensions of attitudes. This structure serves as a heuristic for analyzing the complex and often overlapping layers of public reactions.
Table 3 presents this integrated framework as a conceptual map, serving as a heuristic to delineate the various facets of public attitude toward generative AI videos. The primary strength of this model lies in its matrix structure, which integrates our three thematic categories with the classical ABC model of attitudes. This approach powerfully demonstrates that audience reactions are not one-dimensional; rather, each thematic domain contains distinct affective, behavioral, and cognitive layers, adding significant analytical depth.
Furthermore, these thematic categories are organized in a logical progression, scaling from the micro-level to the macro-level. This conceptual hierarchy moves from the particular to the general, reflecting an increasing level of abstraction:
  • The analysis begins at the most immediate level: AI-Generated Content and Esthetics, focusing on direct, personal interactions with the media artifact itself.
  • It then progresses to the intermediate level of Socio-Technical Systems and Platforms, which addresses reactions to the technologies and corporate actors behind the content.
  • Finally, it culminates at the macro-level with Societal and Ethical Implications, concerning the abstract, wide-ranging impacts of technology on society, culture, and truth.
This deliberate hierarchical structure brings order and clarity to the analysis, allowing for a systematic examination of public attitudes.

4. Methodology

4.1. Topic Modeling

Topic modeling is an unsupervised statistical approach for discovering the hidden thematic structure within large collections of documents. The core idea is to represent each document as a mix of various topics, where each topic is defined as a probability distribution over words [45,68]. This method produces concise, low-dimensional representations of text that are valuable for tasks like corpus exploration, trend detection, information retrieval, and summarization, while retaining a measure of uncertainty through its probabilistic framework [46,69]. In practice, topic models transform high-dimensional text into interpretable thematic mixtures suitable for visualization and further analysis. Thanks to efficient algorithms like variational inference and Gibbs sampling, they can scale to massive datasets and be extended to include factors such as metadata, time, or topic correlations [47,70,71].
Among the various approaches to topic modeling, Latent Dirichlet Allocation (LDA) is the most foundational and widely used method. LDA is a generative probabilistic model that imagines a process by which the documents in a corpus are created. This process assumes that each document is formed from a random mixture of topics, and each word in that document is generated by first choosing a topic from this mixture and then choosing a word from that topic’s corresponding word distribution [68].
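To make this generative story concrete, the toy sketch below (not part of the study’s pipeline) samples a tiny synthetic corpus from the process just described; the corpus sizes and Dirichlet priors are arbitrary illustrative values.

```python
# Toy illustration of LDA's generative process: every document is a mixture of
# topics, and every word is drawn from the topic assigned to its position.
import numpy as np

rng = np.random.default_rng(42)
K, V, D, N_d = 3, 50, 5, 20        # topics, vocabulary size, documents, words per doc
alpha, eta = 0.1, 0.01             # symmetric Dirichlet priors (illustrative values)

beta = rng.dirichlet([eta] * V, size=K)            # K topic-word distributions
toy_corpus = []
for d in range(D):
    theta_d = rng.dirichlet([alpha] * K)           # topic proportions for document d
    z = rng.choice(K, size=N_d, p=theta_d)         # topic assignment for each word slot
    words = [rng.choice(V, p=beta[z_n]) for z_n in z]  # word id drawn from its assigned topic
    toy_corpus.append(words)
```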
For this study, LDA was intentionally selected as the primary analytical tool. This principled methodological choice is driven by the exploratory nature of the research, which prioritizes human interpretability, methodological transparency, and the direct validation of discovered themes. The core strength of LDA lies in its probabilistic, generative framework, which produces directly inspectable outputs: topic-word distributions and per-document topic mixtures [68]. This transparency stands in stark contrast to “black box” models (e.g., BERTopic, Top2Vec) that operate through multi-step pipelines. These methods first convert text into high-dimensional vector representations and then apply complex clustering algorithms, a process that can obscure the direct reasoning behind topic formation. In contrast, LDA’s transparent, word-based outputs enable domain experts to qualitatively validate, label, and interpret the semantic coherence of each topic. This advantage is critical, as foundational studies have demonstrated that purely statistical metrics of model fit do not always correlate with human judgments of topic quality. As Chang et al. (2009) [46] critically showed, models that are statistically “better” can produce topics that are less meaningful to humans, reinforcing the need for methods that prioritize interpretability in applied research.
Furthermore, LDA’s reduced dependency on large, pre-trained models enhances reproducibility, lowers computational costs, and mitigates the risk of domain-shift artifacts—a significant concern when applying general-purpose embeddings to specialized and noisy datasets like user-generated content [72]. These qualities make it a particularly robust choice for the small and heterogeneous nature of the data analyzed in this study.
The continued efficacy of LDA for extracting actionable insights from user-generated content is well-documented in recent literature across diverse domains. For instance, recent studies have successfully employed it to analyze social media discussions on the gig economy [73], identify service quality issues from airline passenger complaints [74], discover user adoption factors from mobile banking app reviews [75], explore user experience concerns in Metaverse games [76], and analyze user perceptions of e-scooter services [77].
In summary, due to its superior interpretability, methodological transparency, and robust performance in exploratory contexts—qualities emphasized in methodological reviews of social data analysis—LDA [45] emerges as the most suitable and effective approach for the objectives of this investigation.
More formally, LDA is a three-level hierarchical Bayesian model where the corpus-level parameters (α and η) provide priors for the document-level topic proportions (θ) and the vocabulary-level topics (β). The joint distribution of all latent and observed variables, which mathematically defines the generative process for an entire corpus, can be expressed as follows [78]:
p(\beta_{1:K}, \theta_{1:D}, z_{1:D}, w_{1:D}) = \prod_{i=1}^{K} p(\beta_i \mid \eta) \prod_{d=1}^{D} p(\theta_d \mid \alpha) \prod_{n=1}^{N_d} p(z_{d,n} \mid \theta_d)\, p(w_{d,n} \mid \beta_{z_{d,n}})
In this formulation, \beta_{1:K} represents the K topics (word distributions), \theta_{1:D} represents the topic proportions for the D documents, z_{1:D} represents the topic assignments for each word in each document, and w_{1:D} are the observed words themselves. The main inferential task in LDA is to compute the posterior distribution of the hidden variables \beta, \theta, and z given the observed documents w. This “reverses” the generative process to uncover the latent thematic structure that most likely produced the text collection [45].
While this inference process provides a mathematical pathway to discovering topics, determining their actual quality and relevance presents a critical challenge, as standard statistical measures of model fit do not always align with human judgments of topic quality.
Therefore, complementing these diagnostics with coherence metrics and human-centered validation is crucial for obtaining meaningful results [46,79,80,81]. A cornerstone of topic modeling is LDA, a hierarchical Bayesian model that provides a generative story for how a document is created [45]. In this model, each document’s topic proportions are drawn from a Dirichlet distribution, and each word in the document is generated by selecting a topic and then drawing a word from that topic’s specific vocabulary distribution. The strength of LDA lies in its clear probabilistic framework, which allows for robust uncertainty quantification and principled extensions. Its proven effectiveness across numerous domains, supported by a vast literature on scalable algorithms and diagnostics, has cemented its importance [45,71,82]. Beyond its direct use, LDA serves as a modular foundation for advanced models that incorporate time, covariates, or topic relationships to answer more complex research questions [70,78,83]. For reliable and meaningful outcomes, users of LDA must diligently tune its hyperparameters, make informed choices about priors, and rigorously validate the coherence of the resulting topics [80,81,82].

4.2. Data Extraction

This study analyzes a corpus of YouTube comments derived from 22 videos published during May and June 2025, focusing on discussions of leading AI video-generation systems, particularly the new class of models exemplified by Google’s Veo 3 and Kling.ai. Basic metadata for the analyzed videos are provided in Appendix A. The selection process was governed by several criteria to ensure the dataset’s topical relevance and analytic utility. We prioritized content with high viewership and substantial comment volume to capture widespread public discourse. To ensure this discourse was multifaceted, the corpus was curated to represent a spectrum of genres and sources, creating a ‘case set’ that includes: (a) Official corporate demonstrations from Google’s own channel (e.g., ‘Meet Veo 3, our latest video generation model,’ ID: ODyROOW1dCo); (b) Creative and artistic showcases by independent creators (e.g., ‘Bigfoot-Born to be Bushy,’ ID: j4CT5dZe8ZA); (c) Technical reviews and competitive comparisons from tech-focused channels (e.g., ‘Veo 3 vs. Kling 2.1 Master,’ ID: gwhSPf3S89M); and (d) Journalistic explorations of the technology’s impact by established media outlets (e.g., ‘We Tested Google Veo and Runway…’ by The Wall Street Journal, ID: US2gO7UYEfY). A visual summary of this video dataset is provided in Figure 2.
Following this content-based curation, and to maintain linguistic homogeneity for downstream topic modeling, the analysis was restricted to English-language videos and comments. This methodology yielded an initial collection of 15,795 comments; following preprocessing and relevance filtering, the final analytic dataset comprised 11,418 comments, representing a retention rate of approximately 72.3%.
It is important to note that this highly specific selection of videos, centered on a major corporate product launch within a narrow timeframe, constitutes a case study approach. This method is particularly well-suited for capturing an in-depth snapshot of public discourse during a critical event of high technological and social salience [84]. Rather than seeking broad generalizability, the goal is to achieve analytical depth by examining the formation of public attitudes and narratives as they unfold in a specific, high-impact context [85]. This approach allows for a rich, contextualized understanding of the mechanisms behind public sense-making in response to new generative AI technologies.
The collection of this data adhered to strict ethical guidelines. As the study exclusively analyzed publicly available user comments and did not involve interaction with human subjects, it was considered exempt from Institutional Review Board (IRB) review. To protect user privacy, all usernames were removed from the dataset prior to analysis, and any quoted comments were paraphrased to prevent deanonymization, in full compliance with YouTube’s Terms of Service.
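As an illustration of how such a corpus can be assembled, the sketch below uses the YouTube Data API v3 (via google-api-python-client) to page through the top-level comments of a small set of video IDs; the API key is a placeholder, only two of the curated videos are shown, and the exact collection code used for this study may differ.

```python
# Hedged sketch of comment collection with the YouTube Data API v3.
from googleapiclient.discovery import build

API_KEY = "YOUR_API_KEY"                       # placeholder credential
VIDEO_IDS = ["ODyROOW1dCo", "j4CT5dZe8ZA"]     # two of the curated videos, for illustration

youtube = build("youtube", "v3", developerKey=API_KEY)

def fetch_comments(video_id):
    """Yield the plain-text body of every top-level comment on one video."""
    request = youtube.commentThreads().list(
        part="snippet", videoId=video_id, maxResults=100, textFormat="plainText"
    )
    while request is not None:
        response = request.execute()
        for item in response["items"]:
            yield item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        request = youtube.commentThreads().list_next(request, response)  # pagination

comments = [c for vid in VIDEO_IDS for c in fetch_comments(vid)]
```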

4.3. Data Preparation and Cleaning

The preprocessing pipeline was implemented as a staged and reproducible workflow to transform the raw YouTube comments into a suitable format for topic modeling. The process, applied to the initial collection of 15,795 comments, is detailed below:
  • Tokenization: The raw text of each comment was first segmented into individual tokens based on whitespace splitting.
  • Text Normalization: A series of deterministic normalization steps were applied to each token. This included converting all text to lowercase, removing punctuation, trimming extraneous whitespace, and eliminating tokens with fewer than two characters.
  • Stopword Filtering: We applied a comprehensive stopword removal process using two main resources: a standard English stopword list (e.g., ‘the’, ‘is’, ‘a’, ‘and’) and a second, researcher-constructed list of terms that were frequent but thematically irrelevant to the study’s focus (e.g., ‘youtube’, ‘video’, ‘comment’).
  • Numeric Token Conversion: All tokens consisting of digits were converted to their English word equivalents (e.g., “3” became “three”) using a standard number-to-word library. These newly generated word-form numbers were then re-filtered against the stopword lists.
  • Content-Based Filtering and Bias Assessment: A critical step was to exclude comments that lacked sufficient semantic content for robust topic modeling. Therefore, any comments with fewer than three remaining tokens after the preceding stages were removed from the dataset. This rule is standard practice to filter out very short, non-substantive comments (e.g., “lol”, “wow”, or single emojis) that contribute noise rather than thematic signal. This filtering process resulted in the exclusion of 4377 comments. Consequently, our final analytic dataset comprised 11,418 comments, representing a retention rate of approximately 72.3% from the initial corpus. While this step is necessary to improve model quality, we acknowledge that it may introduce a bias against purely affective, low-effort reactions and focus the analysis on more deliberative comments.
  • Domain-Specific Rooting: To consolidate morphological and lexical variants, tokens were mapped to curated root forms. This process grouped related words to capture their shared semantic core (e.g., mapping ‘generation’, ‘generated’, and ‘generative’ to a common root).
  • N-gram Construction: To capture meaningful multi-word phrases (collocations) that often represent a single concept (e.g., “uncanny valley”, “will smith”), the token set for each comment was expanded. Using co-occurrence statistics, we identified and included significant bigram and trigram combinations alongside the original unigrams.
Finally, a dictionary mapping unique tokens to IDs and a bag-of-words corpus representing each comment as a vector of token counts were constructed from the fully processed text, serving as the input for the LDA model.
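A condensed sketch of this pipeline is given below, assuming the raw comments are available as a list of strings named comments; the library choices (gensim, num2words) and the custom stopword set are illustrative rather than an exact reproduction of the study’s code, and the domain-specific rooting step is omitted for brevity.

```python
# Sketch of the preprocessing stages: tokenization, normalization, stopword and
# length filtering, number-to-word conversion, n-gram construction, and the
# dictionary/bag-of-words corpus used as LDA input.
import re
from gensim.corpora import Dictionary
from gensim.models.phrases import Phrases
from gensim.parsing.preprocessing import STOPWORDS
from num2words import num2words

CUSTOM_STOPWORDS = {"youtube", "video", "comment"}     # researcher-defined, illustrative
STOP = set(STOPWORDS) | CUSTOM_STOPWORDS

def preprocess(text):
    tokens = text.lower().split()                                   # whitespace tokenization + lowercasing
    tokens = [re.sub(r"[^\w]", "", t) for t in tokens]              # strip punctuation
    tokens = [num2words(int(t)) if t.isdigit() else t for t in tokens]  # "3" -> "three"
    return [t for t in tokens if len(t) > 1 and t not in STOP]      # drop short tokens and stopwords

docs = [preprocess(c) for c in comments]
docs = [d for d in docs if len(d) >= 3]                             # content-based filtering (<3 tokens removed)

# Collocations: append statistically salient bigrams/trigrams (e.g., "will_smith")
bigrams = Phrases(docs, min_count=10, threshold=10.0)
trigrams = Phrases(bigrams[docs], min_count=10, threshold=10.0)
docs = [trigrams[bigrams[d]] for d in docs]

dictionary = Dictionary(docs)                                       # token -> integer id
corpus = [dictionary.doc2bow(d) for d in docs]                      # bag-of-words vectors for the LDA model
```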

4.4. Determining the Optimal Number of Topics in LDA

Selecting the number of topics is a central modeling decision in LDA, as it governs both statistical fit and semantic interpretability. Several diagnostic parameters are typically employed, including topic coherence, which quantifies semantic relatedness among top-ranked words, commonly through normalized pointwise mutual information (NPMI):
\mathrm{NPMI}(w_i, w_j) = \frac{\log \frac{P(w_i, w_j)}{P(w_i)\,P(w_j)}}{-\log P(w_i, w_j)},
aggregated across word pairs to form coherence scores [80,81]. Second, diversity evaluates lexical distinctiveness across topics, defined as
\mathrm{Diversity}_N = \frac{\left| \bigcup_{k=1}^{K} \mathrm{Top}_N(t_k) \right|}{N \cdot K},
where \mathrm{Top}_N(t_k) denotes the set of top-N words in topic t_k, thus capturing the degree of overlap across topics [86,87]. Each metric captures a different dimension of model quality, and none alone is sufficient for reliable evaluation.
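Both diagnostics can be computed directly from a fitted gensim model, as in the brief sketch below; docs and dictionary are assumed to come from the preprocessing step, the model object is called lda_model, and gensim’s built-in 'c_npmi' estimator stands in for the pairwise NPMI aggregation.

```python
# Sketch: NPMI-based coherence and top-word diversity for a fitted gensim LdaModel.
from gensim.models import CoherenceModel

def npmi_coherence(lda_model, texts, dictionary):
    """Mean NPMI coherence over all topics (gensim's 'c_npmi' estimator)."""
    cm = CoherenceModel(model=lda_model, texts=texts,
                        dictionary=dictionary, coherence="c_npmi")
    return cm.get_coherence()

def topic_diversity(lda_model, top_n=25):
    """|union of top-N words across all K topics| / (N * K), as defined above."""
    unique_words = set()
    k = lda_model.num_topics
    for t in range(k):
        unique_words.update(word for word, _ in lda_model.show_topic(t, topn=top_n))
    return len(unique_words) / (top_n * k)
```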

5. Findings and Results

As part of the modeling process, all analyses and topic models were executed in Google Colab. The preprocessing and modeling pipeline used Python (Version 3.12.12) and gensim’s LdaModel. The final model reported here used 15 topics, 40 passes, 150 iterations, an alpha of 0.1, an eta of 0.01, a chunksize of 2000, and a random state of 42. Model development was iterative and computationally intensive: we trained and compared many alternative topic counts, preprocessing variants and random seeds, computed diagnostics (coherence c_v, topic diversity) and then refined the labels through manual inspection of top terms and representative comments.
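A minimal sketch of this final configuration is shown below, assuming the dictionary and bag-of-words corpus produced in Section 4.3; it is a reconstruction from the reported settings rather than the authors’ verbatim script.

```python
# Final 15-topic LDA model with the hyperparameters reported above.
from gensim.models import LdaModel

lda = LdaModel(
    corpus=corpus,
    id2word=dictionary,
    num_topics=15,
    passes=40,
    iterations=150,
    alpha=0.1,
    eta=0.01,
    chunksize=2000,
    random_state=42,
)

# Top terms per topic, used as the basis for manual labeling of the 15 themes.
for topic_id, terms in lda.show_topics(num_topics=15, num_words=10, formatted=False):
    print(topic_id, [word for word, _ in terms])
```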
The selection of k = 15 rests on a combined appraisal of these diagnostics. In this study, k = 15 achieves coherence = 0.4906, and diversity = 0.95. While higher values of k (e.g., 20, 27 or 36) yield slightly increased coherence, they introduce excessive fragmentation and reduced interpretability of topics. According to the elbow principle, which defines the optimal point on a coherence curve where improvements transition from substantial to marginal, coherence gains show a significant jump at 15 topics followed by a plateau, indicating diminishing returns [80,81]. Thus, k = 15 represents the optimal balance between predictive accuracy, semantic clarity, and interpretability, consistent with prior work advocating the integration of statistical diagnostics with human validation [46].
As shown in Figure 3, topic coherence increases steadily up to around 15 topics where it experiences a notable peak (marked by the red circle), after which the curve shows fluctuations but no consistent substantial improvements, indicating that 15 topics represent the optimal balance between coherence and interpretability. The LDA model parameters are also provided below the figure for reference.
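The coherence sweep summarized in Figure 3 can be reproduced along the following lines; this is a sketch in which the candidate k values are illustrative, while the remaining hyperparameters follow the final configuration reported above.

```python
# Sketch of the k-selection sweep: train candidate models, track c_v coherence,
# and look for the "elbow" where gains flatten out.
from gensim.models import LdaModel, CoherenceModel

def cv_coherence_for_k(k):
    model = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k,
                     passes=40, iterations=150, alpha=0.1, eta=0.01,
                     chunksize=2000, random_state=42)
    cm = CoherenceModel(model=model, texts=docs, dictionary=dictionary,
                        coherence="c_v")
    return cm.get_coherence()

coherence_by_k = {k: cv_coherence_for_k(k) for k in range(5, 41, 5)}
# In this study the curve peaked around k = 15 (coherence ~ 0.49) before fragmenting.
```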
Table 4 presents the distribution of topics across the 15 categories identified in the analysis, listing each topic’s title, keywords, and total comment count. A brief description of these topics is provided below:
Topic 0: Technical Support and Global Access to AI Tools
This topic reflects the practical barriers and socio-technical negotiations inherent in technology adoption. It moves beyond the AI-generated content to the infrastructure of access itself. User discussions about geo-restrictions (countri), payment requirements, and the need for VPNs serve as real-world examples of the external variables that influence user acceptance, a core concept in the foundational Technology Acceptance Model (TAM) [88]. These comments highlight that user engagement is not seamless but is actively shaped by platform policies and national (law) boundaries.
Topic 1: AI’s Impact on Reality, Jobs, and Human Creativity
This topic is a clear public discourse on the economic and philosophical implications of automation. The frequent use of keywords like jobs and replaced directly mirrors the concerns of technological unemployment, a central debate in labor economics literature. The discussion is framed within the tension of whether AI acts as a tool that complements human creativity or as a substitutive force for human labor, a distinction central to the work of Acemoglu and Restrepo (2019) [89].
Topic 2: The Uncanny Nature of AI-Generated Content
This topic perfectly encapsulates the esthetic phenomenon known as the “uncanny valley”, a foundational concept in robotics and human–computer interaction [33]. The keyword “uncanni” is a direct reference. Users’ comments focus on the unsettling feeling produced by content that is hyper-realistic yet subtly flawed. Their identification of imperfect details and things that are not “quite right” represents a public negotiation with non-human agents that mimic humanity imperfectly.
This dynamic of near-perfect mimicry remains a central issue in generative video technology. The persistence of the uncanny valley is evident even in state-of-the-art models such as OpenAI’s Sora 2, which has attracted widespread attention for its realistic outputs in late 2025. As demonstrated in Figure 4, the model can achieve a high degree of photorealism in replicating a human subject’s facial features.
However, the digitally generated expression exhibits a subtle hyper-realism that deviates from the neutral human baseline, a characteristic often associated with the uncanny valley effect. This tension between high-level accuracy and micro-level inconsistency is corroborated by external analyses, which note that while Sora 2 excels at rendering likenesses, it can falter in depicting complex physical interactions, such as the precise contact between fingers and piano keys [90]. This pattern—whereby overall realism is undermined by subtle contextual errors—aligns directly with our study’s findings, where users identify such “not quite right” details as a key element of their experience. It confirms that the uncanny valley remains a salient challenge in public perception, even for the most advanced generative systems.
Topic 3: The ‘Will Smith Eating Spaghetti’ Meme Benchmark
This topic isolates a specific cultural artifact—the “Will Smith Eating Spaghetti” video—which has become a memetic benchmark for AI progress and a clear example of participatory culture [91]. It operates as a public touchstone in two ways: technically, viewers use it to scrutinize failures in rendering complex physics, textures, and human anatomy; and culturally/esthetically, its distinctive uncanny “weirdness” makes it a durable signifier of the current state of AI video generation. Figure 5 illustrates how this artifact has evolved between 2022 and 2025 [92].
Topic 4: Belief, Influence, and the Cost of AI
This topic centers on the socio-economic dimensions of AI, particularly its perceived cost (expens) and its power to influence public belief. The discussions align with studies on media effects and persuasion. The recurring slang term cooked serves as a vernacular expression for a state of being deceived or economically doomed by the technology.
Topic 5: AI in Music, Advertising, and Entertainment
This topic captures user discussions around the application of generative AI in specific creative industries. It reflects the ongoing transformation of cultural production, a central theme in Media and Communication Studies. Comments about AI-generated songs and ads are a public appraisal of how these new tools are being integrated into familiar media forms.
Topic 6: Skepticism Towards AI Demonstrations (Google Veo)
This topic embodies a critical public discourse surrounding the corporate control of AI, aligning with the field of Platform Studies [93]. User skepticism is not directed at the technology’s potential, but at the platform (Googl) itself. Comments critique the company’s curated demos, high paywalls, and restrictive access, demonstrating an awareness of platforms as powerful, non-neutral actors.
Topic 7: The Artificial Nature of AI and User Reactions
This topic captures the raw, affective responses to AI’s artificiality. Comments highlight the bizarre (crazi) outputs, focusing on elements like the artificial-sounding voice. This reflects a user-level grappling with a new form of communication that lacks authenticity, forcing a constant re-evaluation of whats real.
Topic 8: Fear, Evidence, and Distrust in AI
This topic centers on the erosion of trust and “epistemic security”. The concern that AI undermines video evidence is a public echo of the legal and democratic challenges posed by deepfakes [16]. The dominant emotion is scare, rooted in the idea that our ability to tell the difference between truth and fiction is being technologically compromised.
Topic 9: Absurd and Comical AI-Generated Scenarios
This topic highlights the emergent, often surreal humor found in AI’s failures. The focus on glitches with hands or bizarre background elements in otherwise serious contexts like a war scene points to a new form of digital comedy, where users find entertainment in unpredictable failures.
Topic 10: AI’s Role in the Film and Entertainment Industry
This topic addresses generative AI as a disruptive force in the creative industries. Comments on individuals creating their own films (‘film_movies’) versus the decline of Hollywood reflect a public debate on the democratization versus devaluation of creative labor, a central theme in media disruption theory.
Topic 11: The AI Video Generation Arms Race (Veo vs. Kling)
This central topic captures the competitive dynamics of the AI technology market. The direct comparison of features and quality between Google’s Veo and Kuaishou’s Kling illustrates how users perceive and participate in a technological “arms race”. This reflects broader studies in technology strategy and innovation, where competing standards and platform ecosystems vie for market and user dominance. The focus is on the tools themselves as objects of analysis.
Topic 12: AI’s Depiction of Surreal and Mythical Content
This topic highlights the use of AI to generate surreal, fantastical, and emotionally resonant content. The specific keyword bigfoot points to a subgenre of AI media that taps into folklore and myth, creating a form of modern digital storytelling. These comments, focusing on love and enjoyment, reveal an appreciation for AI’s ability to create dream-like and emotionally engaging narratives that exist outside the bounds of realism.
Topic 13: Trust, News, and Technological Advancements in Media
Similar to Topic 8 but with a stronger focus on institutional media (news), this topic addresses the profound epistemological crisis spurred by AI. The central theme is the collapse of trust in visual media. The idea that society is ‘cooked’ because one can no longer believe what a camera captures is a direct public response to the threat that deepfakes pose to journalism, historical records, and shared reality [16].
Topic 14: The Commercialization and Software Aspect of AI
This topic focuses on the business and development side of AI. Comments discuss commercial applications, start-up ideas, and the role of different studios. This aligns with research in innovation studies and strategic management, where the conversation is about market opportunities, business models, and the competitive landscape of a new technology.
Table 5 shows the Topic Model Diagnostics for a 15-topic LDA model. To interpret its performance, several key metrics were computed:
  • Coherence: This metric quantifies the human interpretability of a topic, measuring the semantic similarity between its top words. Higher scores indicate that the keywords form a coherent, logical concept (e.g., “car”, “engine”, “wheel”) rather than a random assortment [79,80];
$$C_{\mathrm{NPMI}}(t) = \frac{1}{\binom{N}{2}} \sum_{i=2}^{N} \sum_{j=1}^{i-1} \frac{\log \frac{P(w_i, w_j)}{P(w_i)\,P(w_j)}}{-\log P(w_i, w_j)}$$
  • Tokens: Serves as a proxy for a topic’s prevalence or “mass” within the corpus, approximating the total number of words assigned to a topic across all documents [45];
$$T_t = \sum_{d \in D} |d| \cdot p(t \mid d)$$
  • Exclusivity: Measures how unique a topic’s top words are relative to other topics. High values indicate lexical distinctiveness [94];
$$E_t = \frac{1}{N} \sum_{w \in W_t^{(N)}} \frac{p(w \mid t)}{\sum_{k=1}^{K} p(w \mid k)}$$
  • Cosine Distance: Measures the average dissimilarity of a topic from all others in the word-probability space [95]. As formalized in the equation below, this metric averages the cosine distance (1 − cosine similarity) between the word-probability distribution of a given topic, p_t, and that of every other topic, p_s. Given that LDA topics are represented by high-dimensional and relatively sparse Bag-of-Words (BoW) vectors, a higher cosine distance indicates stronger lexical separation among topics. This is interpreted as a positive outcome, confirming that the model has successfully identified thematically distinct clusters.
$$\mathrm{CosDist}(t) = \frac{1}{K-1} \sum_{s \neq t} \left( 1 - \frac{p_t \cdot p_s}{\lVert p_t \rVert_2 \, \lVert p_s \rVert_2} \right)$$
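To make these diagnostics concrete, the following is a minimal sketch (not the authors’ pipeline; the function name and the arrays topic_word, doc_topic, and doc_lengths are illustrative) of how the Tokens, Exclusivity, and Cosine Distance measures defined above could be computed from a fitted LDA model. NPMI coherence is typically obtained separately from corpus co-occurrence statistics, for example via gensim’s CoherenceModel with coherence='c_npmi'.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_distances

def topic_diagnostics(topic_word, doc_topic, doc_lengths, top_n=15):
    """
    topic_word:  (K, V) array of p(w | t) for each topic.
    doc_topic:   (D, K) array of p(t | d) for each document (comment).
    doc_lengths: (D,) array with the token count of each document.
    """
    K, _ = topic_word.shape

    # Tokens: expected number of corpus tokens attributed to each topic, T_t = sum_d |d| * p(t | d).
    tokens = (doc_topic * doc_lengths[:, None]).sum(axis=0)

    # Exclusivity: for each topic's top-N words, the mean share of that word's
    # total probability mass (across all K topics) held by the topic.
    top_words = np.argsort(-topic_word, axis=1)[:, :top_n]
    word_mass = topic_word.sum(axis=0)                      # sum_k p(w | k) for every word
    exclusivity = np.array([
        (topic_word[k, top_words[k]] / word_mass[top_words[k]]).mean()
        for k in range(K)
    ])

    # Cosine Distance: average (1 - cosine similarity) between a topic's word
    # distribution and the distributions of all other topics.
    pairwise = cosine_distances(topic_word)                 # (K, K) matrix, zero diagonal
    cos_dist = pairwise.sum(axis=1) / (K - 1)

    return tokens, exclusivity, cos_dist
```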
Evaluating the topics against these diagnostic measures demonstrates that the LDA model applied to YouTube comments produces a compact and actionable map of the public debate around AI. Topic 11 functions as the corpus anchor with the largest token mass (34,003) and solid coherence (≈0.74), reflecting a broad, connecting theme that links multiple sub-discourses. Topic 9 emerges as the most interpretable micro-topic (coherence ≈ 0.90) and thus represents a natural target for qualitative sampling.
Topic 10 also carries substantial mass (≈30,320 tokens) but exhibits lower exclusivity, suggesting an industry-level theme that overlaps with other topics and may require further disaggregation. In contrast, Topic 4 achieves very high exclusivity (≈0.97) despite only moderate coherence, indicating a concentrated vocabulary—useful for targeted monitoring once its semantic scope is clarified. Topic 14 combines high exclusivity (≈0.94) with the greatest cosine distance from other topics, signaling an isolated commercial/software sub-discourse of relatively small size that is well-suited for focused analysis. Topics 2 and 3 similarly isolate niche vocabularies (high exclusivity) around uncanny content and meme benchmarks, whereas Topic 1 surfaces a large, highly coherent conversation about AI’s impact on reality, work, and creativity. Several mid-mass topics, including Topics 5, 6, 8, 12, and 13, capture domain-specific debates—such as music, skepticism, distrust, surreal depictions, and media trust—with moderate coherence and useful exclusivity for downstream coding. One topic (Topic 7) exhibits very low coherence, reflecting fragmented or noisy comments that warrant preprocessing or cautious interpretation. A closer analysis reveals that this low coherence is an analytically significant finding rather than a model deficiency. Topic 7, with keywords like ‘crazi’, ‘whats real’, and ‘artifici’, captures the immediate, visceral, and lexically diverse reactions of users grappling with the artificiality of the content. Such spontaneous exclamations do not form tight, co-occurring word clusters, leading to a predictably low statistical coherence score. However, this topic is conceptually coherent as it represents a crucial dimension of user experience: the raw, unfiltered confusion and cognitive dissonance that occurs upon first encounter. Therefore, we retained it in our framework, acknowledging that its value is interpretive rather than statistical, and its inclusion prevents the loss of an important facet of public discourse. Taken together, the joint diagnostics—Coherence, Tokens, Exclusivity, and Cosine Distance—demonstrate that the model reliably identifies both dominant narratives for macro-level analysis and distinct, high-precision micro-topics for targeted study. Practitioners can therefore leverage high-coherence, high-token topics for trend analysis, and high-exclusivity/high-cosine-distance topics for specialized monitoring.

5.1. Visualization of Topics

To visualize the topics, the t-SNE algorithm [96] was applied to reduce the high-dimensional LDA outputs to a two-dimensional space. Figure 6 presents the 15-topic distribution, where each comment is represented as a scatter point. The size of each circle reflects the predicted probability of the comment’s dominant topic, so comments strongly associated with a topic appear larger.
A slight jitter was added to reduce point overlap, and pastel colors were used to differentiate topics. The t-SNE was configured with 2 components, a perplexity of 35, a learning rate of 400, a maximum of 250 iterations, and a random state of 54. Comments positioned near other topics suggest content similarity or overlapping themes.
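As an illustration, the following is a minimal sketch (assuming scikit-learn and matplotlib; only the t-SNE settings mirror those reported above, while the function name and plotting details are illustrative) of how such a projection can be produced from the per-comment topic distributions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_topic_map(doc_topic, random_state=54):
    """doc_topic: (D, 15) array of per-comment topic probabilities from the LDA model."""
    dominant = doc_topic.argmax(axis=1)   # dominant topic of each comment
    strength = doc_topic.max(axis=1)      # probability of that dominant topic

    tsne = TSNE(n_components=2, perplexity=35, learning_rate=400,
                n_iter=250, random_state=random_state)  # 'n_iter' is named 'max_iter' in newer scikit-learn
    coords = tsne.fit_transform(doc_topic)

    # Slight jitter to reduce point overlap, as described above.
    coords = coords + np.random.default_rng(random_state).normal(scale=0.5, size=coords.shape)

    plt.figure(figsize=(10, 8))
    plt.scatter(coords[:, 0], coords[:, 1], c=dominant, cmap="Pastel1",
                s=20 + 180 * strength, alpha=0.7)  # circle size reflects dominant-topic probability
    plt.title("t-SNE projection of the 15 LDA topics")
    plt.tight_layout()
    plt.show()
```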

5.2. Hierarchical Classifications of Public Discourse: From Topics to Themes

This analysis of the public discourse surrounding generative AI contributes a novel conceptual framework derived directly from user-generated topics. The core academic contribution of this work is the development of a higher-order classification that organizes the 15 machine-identified topics into three analytically robust and theoretically meaningful categories. This classification—comprising Socio-Technical Systems and Platforms, AI-Generated Content and Esthetics, and Societal and Ethical Implications—provides a structured framework for interpreting the complex interplay between the technological and social dimensions of AI adoption. This framework is presented in Table 6.
To translate the statistical outputs of the LDA model into analytically meaningful insights, we implemented a structured, human-in-the-loop interpretive procedure. Topic labeling was conducted independently by the study’s two researchers. Each researcher first examined the 15 highest-probability keywords and the five most representative comments (those with the highest probability of belonging to a given topic) for each of the 15 topics and, based on this holistic reading, independently proposed a descriptive title for each topic.
To evaluate the reliability of this coding, we calculated inter-rater reliability (IRR), following studies such as [97,98]. Several measures are commonly reported in the literature, including percent agreement, Scott’s π [99], Krippendorff’s α [100], and Cohen’s κ [101]. As the data were not highly skewed, these measures yielded consistent results; we therefore report Cohen’s κ = 0.82, indicating “strong agreement” according to established benchmarks [102]. Remaining discrepancies in labeling were discussed in a brief evaluation session and resolved through a deliberative process until full consensus on the final topic labels was reached. This human evaluation step ensured that the resulting topics were both statistically coherent and semantically interpretable, enhancing the transparency and reproducibility of the analysis. This multi-stage process, which moves from automated discovery to qualitative validation and thematic synthesis, ensures our findings are both computationally grounded and interpretively robust, reflecting established best practices for bridging quantitative and qualitative methodologies [46,72,103].
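As an illustration only (not the authors’ code; the function name, array names, and coder labels are hypothetical), the sketch below shows how the labeling inputs described above can be extracted and how Cohen’s κ can be computed with scikit-learn.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

def labeling_inputs(topic_word, doc_topic, vocab, comments, n_words=15, n_docs=5):
    """Return, for each topic, its top keywords and its most representative comments."""
    out = {}
    for k in range(topic_word.shape[0]):
        top_words = [vocab[i] for i in np.argsort(-topic_word[k])[:n_words]]
        top_docs = [comments[i] for i in np.argsort(-doc_topic[:, k])[:n_docs]]
        out[k] = {"keywords": top_words, "representative_comments": top_docs}
    return out

# Hypothetical per-topic theme labels assigned independently by the two coders.
coder_a = ["platforms", "esthetics", "esthetics", "societal", "platforms"]
coder_b = ["platforms", "esthetics", "societal", "societal", "platforms"]
kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa = {kappa:.2f}")  # values above 0.80 are conventionally read as strong agreement
```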
This classification facilitates focused scholarly inquiry, enabling researchers to link empirical findings to broader debates in Information Systems, Human–Computer Interaction, Communication, and AI Ethics, and to derive implications for platform design, creative practices, and normative governance.
Socio-Technical Systems and Platforms
Topics grouped under “Socio-Technical Systems and Platforms” focus on the technologies, companies, and competitive dynamics that shape the generative AI landscape. This classification is consistent with socio-technical theory, which posits that technology and its social context are mutually constitutive. Discussions in this category are not about the AI content itself, but the underlying infrastructure that produces and delivers it. This includes practical user concerns about accessibility, such as geo-restrictions and the need for specific software, as captured in Topic 0 (Technical Support and Global Access to AI Tools). It also reflects a critical awareness of corporate behavior, as seen in user discussions expressing skepticism about curated corporate demonstrations in Topic 6 (Skepticism Towards AI Demonstrations). Furthermore, this theme addresses the market dynamics, evident in conversations about the competitive “arms race” between different AI models in Topic 11 (The AI Video Generation Arms Race), and the broader commercial applications and software development discussed in Topic 14 (Commercial and Business Use of AI Software). Together, these topics illustrate how users perceive generative AI as a system of tools, corporate actors, and market forces, rather than a standalone technology.
AI-Generated Content and Esthetics
The “AI-Generated Content and Esthetics” category includes topics centered on the AI-generated media artifacts themselves—their esthetic qualities, cultural impact, and the audience’s direct interpretations. This theme captures the immediate, personal reactions to the content. These responses range from feelings of unease and fascination with hyper-realistic yet flawed creations, a phenomenon known as the “uncanny valley” and the focus of Topic 2 (The Uncanny Nature of AI-Generated Content), to the cognitive struggle to distinguish authentic from synthetic media in Topic 7 (The Artificial Nature of AI and User Reactions). Public discourse also reveals an appreciation for the unique creative capabilities of AI, such as its ability to generate surreal and mythical content in Topic 12 (AI’s Depiction of Surreal and Mythical Content) and the emergent, often accidental humor found in its technical glitches, as seen in Topic 9 (Absurd and Comical AI-Generated Scenarios). This category also encompasses how AI content is integrated into cultural practices, becoming a memetic benchmark for technological progress in Topic 3 (The ‘Will Smith Eating Spaghetti’ Meme Benchmark) or being applied in creative industries like music and advertising, as discussed in Topic 5 (AI in Music, Advertising, and Entertainment) and the broader film industry in Topic 10 (AI’s Role in the Film and Entertainment Industry). These topics collectively address the quality, style, and interpretation of AI’s creative output.
Societal and Ethical Implications
The “Societal and Ethical Implications” category encompasses the broad, macro-level consequences of generative AI, including debates on labor, truth, trust, and the future of media. This cluster forms a coherent unit grounded in established AI ethics and social impact frameworks, as the topics share a focus on the technology’s effects on human values and norms. Core concerns include the economic and philosophical impact of AI on jobs and human creativity, as captured in Topic 1 (AI’s Impact on Reality, Jobs, and Human Creativity). This theme also reflects deep-seated anxieties about the erosion of trust and the potential for manipulation. Topic 8 (Fear, Evidence, and Distrust in AI) highlights fears that AI will make it impossible to trust video evidence, leading to a state of “epistemic anxiety”. This concern is broadened in Topic 13 (Trust, News, and Technological Advancements in Media), which addresses the collapsing trust in institutional media. Finally, Topic 4 (Belief, Influence, and the Cost of AI) centers on the technology’s power to influence public belief and the potential for widespread deception. These topics cohere around the significant ethical questions posed by generative AI, mapping directly onto normative principles like fairness, accountability, and the potential for societal harm.
In summary, these three thematic categories reveal that public discourse on generative AI is not monolithic; rather, it is a multi-layered phenomenon. This discourse spans from practical and critical evaluations of the socio-technical systems that produce and deliver the technology—including platforms, corporate actors, and market dynamics—to immediate, personal reactions concerning the esthetic qualities and cultural impact of the content itself. Finally, it culminates in abstract, normative judgments about the technology’s broad societal and ethical implications, such as its effects on truth, labor, and trust.
Thus, these topics collectively form a comprehensive analytical unit. This framework demonstrates a coherent public logic that engages with generative AI simultaneously as a technological system, a cultural artifact, and a force for societal change, aligning with theories in communication, technology ethics, and public policy.
While the thematic framework presented in Table 6 effectively organizes the primary domains of public discourse, a deeper analysis reveals a hierarchical structure to these concerns. To better understand the interplay between individual experience and systemic impact, we can re-map these categories onto a nested model of analysis. This approach, presented in Table 7, reframes the public’s engagement with generative AI as a multi-layered process, scaling from direct, personal encounters with the technology to broad, abstract considerations of its societal consequences. This new perspective provides a more dynamic view of how micro-level reactions inform macro-level ethical debates, offering targeted insights for different stakeholders.

5.3. The Psychological Dynamics of Public Discourse: An ABC-Integrated Analysis

The framework presented in Table 7 reveals that the public discourse on AI video is not psychologically monolithic; rather, each thematic layer possesses its own distinct psychological architecture, defined by the dynamic interplay of Affective, Behavioral, and Cognitive components. Analyzing these dynamics moves beyond classification to explain how public attitudes are formed and structured.
AI-Generated Content and Esthetics: An Affect-Driven Cascade
At the most immediate level of interaction—the content itself—the psychological process is predominantly an affect-driven cascade. The initial engagement is visceral and emotional (Affect). The unsettling realism of Topic 2 (“The Uncanny Nature”) or the amusement at the glitches in Topic 9 (“Absurd Scenarios”) represents an instinctive feeling that arises before complex analysis. This powerful affective trigger then prompts a Cognitive appraisal, forcing the user to question the nature of what they are seeing (“Is this real or fake?” as in Topic 7). This cognitive work, in turn, fuels Behavioral outcomes. Content that evokes strong affect is more likely to be shared, remixed into memes (as with the ‘Will Smith’ benchmark in Topic 3), or integrated into cultural production (Topic 5), completing the cascade from feeling to thinking to acting.
Socio-Technical Systems and Platforms: A Behavior-Centric Engagement
When the focus shifts to the underlying infrastructure, the psychological dynamic inverts to become behavior-centric. The primary driver here is the user’s goal-oriented action or intention (Behavior). Discussions are dominated by the practicalities of using, accessing, and comparing the technology, as seen in Topic 0 (navigating access barriers) and Topic 11 (the “arms race” between tools). This hands-on engagement directly informs Cognitive judgments about the corporate actors behind the platforms. Experiencing a paywall or a curated demo leads to skepticism and strategic beliefs about corporate motives (Topic 6). The Affective dimension, such as frustration with access or excitement about a new feature, is often a consequence of these behavioral and cognitive processes, rather than their trigger.
Societal and Ethical Implications: A Cognitive-Affective Feedback Loop
At the most abstract, macro-level, the discourse is characterized by a powerful cognitive-affective feedback loop. The process begins with a core Cognitive belief or judgment about AI’s fundamental impact on society—its threat to jobs, reality, and truth (Topic 1, Topic 13). This abstract belief is not emotionally neutral; it fuels a potent Affective response, primarily fear, anxiety, and distrust (Topic 8). This raw fear then acts as a powerful lens that reinforces and deepens the initial cognitive belief, creating a self-perpetuating cycle of concern. The Behavioral dimension in this theme is largely an outcome of this intensified attitude, manifesting as intentions to adapt media consumption habits or advocate for regulation. The action is a response to a deeply held, emotionally charged worldview.
In summary, by analyzing the interplay within the ABC model, we demonstrate that public reactions to generative AI are not uniform. They are structured by distinct psychological mechanisms depending on whether the user is engaging with the artifact, the platform, or the societal implications.

5.4. Key Findings in Response to Research Questions

This study examines how ordinary users perceive AI-generated videos on major public platforms. Through a systematic analysis of user comments, it reveals the multifaceted ways in which individuals engage with AI-generated content, encompassing technical, esthetic, ethical, and societal dimensions. The investigation is guided by four principal research questions: (1) what users express about AI videos, (2) how these responses can be organized into coherent thematic categories, (3) which themes predominate in everyday discourse, and (4) how perceptions of realism and visual fidelity shape user reactions. The answers to these questions are summarized below:

5.4.1. (RQ1) What Do Ordinary Users Say They Think About AI-Generated Videos on Major Public Platforms?

Ordinary users engage in a multifaceted discourse about AI-generated videos that is far from monolithic, covering a spectrum of practical, esthetic, and ethical concerns. Rather than being a disordered collection of opinions, these user discussions align coherently with the study’s three-tiered thematic framework. Based on the 15 topics identified, user commentary can be systematically organized as follows:
  • Socio-Technical Systems and Platforms: A significant portion of the discourse centers on the underlying infrastructure that produces and delivers AI content. This includes practical user concerns about accessibility, such as geo-restrictions and the need for specific software (Topic 0), as well as a critical awareness of corporate behavior, as seen in user skepticism toward curated demonstrations (Topic 6). Furthermore, users are keenly aware of the market dynamics, frequently discussing the competitive “arms race” between different AI models (Topic 11) and the broader commercial and business applications of the technology (Topic 14).
  • AI-Generated Content and Esthetics: This theme captures users’ immediate and direct reactions to the media artifacts themselves. The commentary is rich with esthetic and affective evaluations, ranging from feelings of unease with the hyper-realistic yet flawed “uncanny valley” effect (Topic 2) to the cognitive struggle to distinguish authentic from synthetic media, a core element of Topic 7 (“The Artificial Nature of AI and User Reactions”). Users also find amusement at the emergent, often accidental humor in technical glitches (Topic 9) and express appreciation for AI’s unique creative capabilities, such as its ability to generate surreal and mythical content (Topic 12). This discourse extends to how AI content is integrated into cultural practices—becoming a memetic benchmark for technological progress (Topic 3) or being applied in creative industries like music, advertising, and film (Topic 5, Topic 10).
  • Societal and Ethical Implications: Finally, users grapple with the broad, macro-level consequences of generative AI. These discussions are grounded in deep-seated ethical and normative concerns. Core among them is the technology’s perceived impact on jobs, human creativity, and the very fabric of reality (Topic 1). This theme also reflects profound anxieties about the erosion of trust and the potential for manipulation. Users express significant fear that AI will make it impossible to trust video evidence, leading to a collapse of “epistemic security” (Topic 8), a concern that broadens to include the collapsing trust in institutional media (Topic 13). These anxieties cohere around the significant ethical questions posed by generative AI, mapping directly onto its power to influence public belief and the potential for widespread societal harm (Topic 4).

5.4.2. (RQ2) Can These Natural Responses Be Grouped into a Small Set of Recurring Thematic Categories?

Yes. The study’s central contribution is demonstrating that these diverse topics can be coherently organized into a higher-order conceptual framework. As presented in Table 6, the 15 discrete topics are grouped into three overarching themes:
  • Socio-Technical Systems and Platforms: This category consolidates discussions about the technology itself—the tools, the companies that control them, their accessibility, and the competitive market dynamics.
  • AI-Generated Content and Esthetics: This groups topics focused on the media artifacts produced by AI, including their esthetic qualities (e.g., uncanny, comical, surreal), cultural impact, and genre applications.
  • Societal and Ethical Implications: This theme encompasses the broader societal consequences, including debates on labor displacement, the erosion of truth and trust, and widespread fear about the future of media.

5.4.3. (RQ3) Which Themes Dominate Everyday Discourse and Which Themes Are Comparatively Rare?

The analysis of token mass and comment counts indicates a clear hierarchy of themes.
  • Dominant Themes: The discourse is most heavily dominated by topics centered on specific cultural artifacts and immediate, visceral reactions. The single most discussed topic is “The ‘Will Smith Eating Spaghetti’ Meme Benchmark” (Topic 3) with 1652 comments, indicating that tangible, memetic touchstones are powerful drivers of conversation. This is followed closely by discussions of “The Uncanny Nature of AI-Generated Content” (Topic 2) with 1240 comments, highlighting the prevalence of esthetic and affective user responses. Practical issues also generate significant engagement, with “Technical Support and Global Access to AI Tools” (Topic 0) attracting 1064 comments.
  • Comparatively Rare Themes: In contrast, topics requiring more abstract or industry-specific knowledge receive significantly less direct engagement. The least discussed topic is “Skepticism Towards AI Demonstrations (Google Veo)” (Topic 6), with only 395 comments, suggesting that critique of specific corporate demos is a niche conversation. Similarly, discussions around “AI’s Role in the Film and Entertainment Industry” (Topic 10) (514 comments) and “Belief, Influence, and the Cost of AI” (Topic 4) (524 comments) represent smaller sub-discourses compared to the dominant themes.

5.4.4. (RQ4) How Do Perceptions of Realism and Visual Fidelity Shape User Reactions?

Realism in AI-generated video operates as a double-edged sword. It is both the source of its greatest perceived threat (the inability to trust our eyes) and the foundation of its greatest appeal (the power to create stunning and previously unimaginable visual experiences). User reactions are therefore not simply negative but reflect a fundamental tension between the technology’s potential for deception and its potential for artistic and creative expression.

6. Conclusions

This study advances the understanding of public responses to generative AI by integrating empirical evidence from a substantial YouTube comment corpus with established conceptual frameworks. Rather than treating public opinion as a single, uniform attitude, our topic-centered analysis reveals a plural and structured set of concerns that consistently cluster into three higher-order dimensions: Socio-Technical Systems and Platforms; AI-Generated Content and Esthetics; and Societal and Ethical Implications. These dimensions capture how everyday viewers negotiate issues of platform power, interpret esthetic qualities like the “uncanny”, and grapple with the epistemic and labor-related consequences of synthetic media. The resulting taxonomy provides a focused portrait of the key issues citizens attend to when confronted with this transformative technology.
While existing literature provides a crucial foundation, our framework offers a more refined and comprehensive map of public discourse. Previous computational analyses successfully identified foundational dichotomies, such as the split between creative applications and ethical concerns on Reddit [6] or the tension between amusement and fear on YouTube [22]. Our three-tiered model moves beyond such binary classifications to provide a more granular structure. Furthermore, experimental work by Seo et al. (2025) [18] and Belanche et al. (2025) [19] compellingly demonstrated the importance of authenticity and trust in controlled, commercial contexts; our study complements this by analyzing spontaneous, naturalistic discourse on a broad video-native platform. Ultimately, where methodological studies by Gupta et al. (2024) [20], and Sun et al. (2025) [21] established the viability of applying topic modeling to academic literature or short-text corpora, this work operationalizes those methods at scale to produce a substantive theoretical contribution. By synthesizing large-scale, “in-the-wild” data analysis with the established ABC model of attitudes, this study produces a uniquely structured and explanatory taxonomy of public reactions that integrates technical, esthetic, and ethical concerns.

6.1. Theoretical Contributions

This study’s primary theoretical contribution is the articulation of a multi-layered psychological architecture of public discourse surrounding a disruptive technology. We advance the application of the ABC model of attitudes [24] beyond simple validation or static classification. Crucially, we demonstrate that the relationship between Affect, Behavior, and Cognition is not fixed, but rather reconfigures dynamically across different domains of engagement. Our integrated analysis, detailed in Section 5.3, reveals three distinct “psychological signatures” that characterize public reactions to generative AI:
  • An affect-driven cascade defines immediate reactions to AI-generated media artifacts, where visceral feelings (e.g., the unease of the “uncanny valley”) precede and trigger cognitive appraisal (e.g., questioning authenticity) and subsequent behavioral engagement (e.g., memetic sharing).
  • A behavior-centric process characterizes interactions with the underlying socio-technical platforms, where practical, goal-oriented actions (e.g., navigating access barriers or comparing tools) shape strategic cognitive judgments about corporate actors and resultant affective states like frustration or excitement.
  • A cognitive-affective feedback loop structures the abstract societal and ethical debates, where deeply held cognitive beliefs about AI’s impact (e.g., on jobs and truth) and potent affective responses (e.g., fear and anxiety) mutually reinforce one another, culminating in behavioral intentions such as calls for regulation.
By empirically identifying these distinct psychological mechanisms, this research offers a significant theoretical refinement to the study of public perception of technology. Where much of the literature treats user attitudes as a monolithic construct—often reducing them to a single score of “trust”, “acceptance”, or a simple positive/negative valence [11,25]—our framework provides a data-driven, multi-dimensional counter-narrative. We demonstrate that public opinion is not just a collection of disparate views but a structured system with a variable psychological architecture. This dynamic model explains how individuals can simultaneously harbor fascination and fear, creative adoption and deep-seated skepticism, as these reactions are channeled through different psychological pathways depending on the object of their focus: the content, the platform, or the societal consequence.
This nuanced understanding moves the inquiry from what people think about generative AI to the more fundamental question of how these complex public attitudes are constructed, maintained, and evolve. This dynamic, multi-layered framework provides a more robust theoretical foundation for future research into societal adaptation to emerging technologies and offers a clear, empirically grounded model for making sense of the complex interplay between innovation and public sentiment.
Beyond this core theoretical framework, the findings offer nuance to dominant academic debates.
First, by foregrounding topic-level heterogeneity, we refine accounts of “public trust”, demonstrating that trust-related anxieties are differentiated—distinguishing between institutional (news/media) and evidentiary (camera-as-proof) concerns, each carrying distinct normative implications. This empirical differentiation of trust into institutional and evidentiary concerns adds significant granularity to the broader discussions on epistemic security and algorithmic accountability found in legal and ethical scholarship [16,42]. Where the literature often treats the “erosion of trust” as a monolithic concept, our findings suggest it is a fragmented crisis, requiring targeted, domain-specific interventions rather than a one-size-fits-all solution. Second, the findings clarify the creative value and labor debate by showing that public discourse simultaneously valorizes novel, meme-driven cultural practices while expressing deep anxiety about job displacement and the devaluation of human authorship.
Ultimately, the topical map illuminates two central tensions that define the public’s relationship with generative AI:
  • Esthetic Engagement vs. Epistemic Anxiety: Users are fascinated by the creative frontiers of AI video but are equally fearful that the same technology is eroding the foundations of trust in visual media. The more realistic the content, the greater the anxiety. This tension directly reflects the foundational concept of the ‘uncanny valley’ [33], where hyper-realism triggers affective discomfort, a phenomenon that our findings show is now intrinsically linked to the cognitive fear of pervasive, undetectable deepfakes [24].
  • Democratization of Creation vs. Centralization of Power: While AI tools promise to democratize media production, the public discourse reveals an acute awareness that the technology’s development and control are concentrated in the hands of a few powerful corporate actors. This dichotomy speaks directly to the core tenets of platform studies, which interrogate the non-neutral role of corporate actors in shaping public discourse and access [54,93]. While the technology promises democratization, the public is keenly aware that the underlying infrastructure operates within a framework of corporate governance and market logic, echoing concerns raised in the literature about platform politics.
This reframes the public conversation away from abstract technophobia and toward a set of tangible, concrete socio-technical problems: issues of access (who can use these tools and at what cost?), authenticity (how can we verify what we see?), labor (what is the future of human creativity?), and governance (who is responsible for the consequences?).

6.2. Practical Contribution

Grounded directly in the thematic patterns of public discourse identified in our analysis, this study offers targeted, evidence-based recommendations for platforms, designers, regulators, and cultural institutions. Each recommendation is a direct response to the specific concerns and behaviors voiced by users.
Prioritized Thematic Monitoring: Our analysis reveals that public concern is not a generic “AI worry” but is clustered around specific, high-salience topics. Platforms and policymakers should therefore move beyond broad sentiment tracking and instead monitor the distinct signals identified in this study. For instance, the intense user focus on the “AI Video Generation Arms Race” (Topic 11) indicates a need for clear communication on competitive features, while the prevalence of “Fear, Evidence, and Distrust in AI” (Topic 8) and reactions to “The Uncanny Nature” (Topic 2) signals emergent risks to user trust and well-being that require proactive content moderation and design interventions.
Provenance and Attribution Workflows: The profound “epistemic anxiety” expressed by users in topics concerning the erosion of video evidence (Topic 8) and the collapse of trust in institutional media (Topic 13) demonstrates an urgent public demand for reliable signals of authenticity. To address these fears, platforms should implement robust, interoperable provenance metadata (e.g., C2PA standards) and clear, user-facing attribution for all AI-generated content as a normative default, not an optional feature.
Esthetic-Sensitive Product Design: The strong, often negative, affective reactions captured in Topic 2 (The Uncanny Nature) highlight the risks of uncurated, hyper-realistic outputs. Concurrently, the user appreciation for emergent humor in Topic 9 (Absurd and Comical Scenarios) shows an appetite for non-photorealistic, creative applications. Product design teams should respond to this by incorporating human-in-the-loop curation, requiring visible attribution, and considering design constraints (e.g., on photorealism in high-risk domains like political or news-related content) to mitigate negative affective responses and channel creative potential responsibly.
Sectoral Literacy and Governance: The specific anxieties about the future of journalism and institutional media found in Topic 13 (Trust, News, and Technological Advancements) point to the need for domain-specific interventions. Coordinated public education campaigns targeting journalists, legal professionals, and courts are crucial for building sectoral literacy on how to verify and use synthetic media. This should be paired with the development of clear evidentiary standards to preserve information integrity while permitting beneficial uses of generative tools.
Support for Creative Ecosystems: The public discourse reveals a central tension between AI as a tool for creative democratization and a force for labor devaluation, as seen in Topic 1 (“AI’s Impact on Reality, Jobs, and Human Creativity”) and Topic 10 (“AI’s Role in the Film and Entertainment Industry”). To navigate this, platforms and guilds should develop reputation mechanisms (e.g., verified creator programs that distinguish human-led work) and policies that recognize and support hybrid human-AI creative practices, ensuring that new technologies sustain rather than erode remunerative pathways for creators.
These topic-driven recommendations map operational priorities directly onto the substantive concerns voiced by users, providing stakeholders with an evidence-informed set of interventions to align technological practice with social expectations. This empirical grounding provides a clear and actionable agenda for building a more responsible and trustworthy AI ecosystem.

7. Limitation and Future Studies

While this study presents a robust, data-driven framework of public discourse on generative AI, it is important to acknowledge several limitations that provide context for our findings and illuminate critical pathways for future research.
The primary limitation is the specificity of our dataset, which defines the scope of our findings. Our analysis is intentionally framed as a case study focused on public reactions to a high-profile corporate technology launch (Google’s Veo 3) within a constrained timeframe. Consequently, the results are a snapshot of a particular moment, and the distribution of topics may be influenced by this specific context (e.g., heightened skepticism towards a large corporation). Furthermore, the dataset reflects the views of a self-selected, English-speaking user base, whose discourse is inevitably shaped by the norms and demographics of the YouTube community. Therefore, while this approach offers a highly contextualized and detailed map of public concerns at a critical juncture, the findings are not directly generalizable to all forms of AI-generated content, such as independent artistic creations or political deepfakes, nor do they represent a global perspective.
Methodologically, while LDA excels at identifying what topics are salient, it cannot, by itself, explain why they emerge or establish causal links between content exposure and shifts in belief. Our topic-centered approach successfully maps public concerns but does not explain their underlying causal drivers. While LDA was intentionally chosen for its transparency and interpretability, the robustness of our findings could be further strengthened. Future work should therefore compare our framework against alternative embedding-based neural topic models (e.g., BERTopic, Top2Vec), which are often more effective for short, informal texts.
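As a pointer for such a comparison, the following is a minimal sketch (assuming the bertopic package; not part of this study’s pipeline, and all parameter choices are illustrative) of how an embedding-based model could be fitted to the same comment corpus.

```python
from bertopic import BERTopic

def fit_bertopic(comments):
    """comments: list of raw or lightly cleaned YouTube comment strings."""
    model = BERTopic(language="english", min_topic_size=50, calculate_probabilities=True)
    topics, probs = model.fit_transform(comments)
    return model, topics, probs

# The resulting topics could then be aligned with the 15 LDA topics
# (e.g., by matching top keywords) to assess the robustness of the thematic framework.
```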
These limitations, however, do not diminish the findings but instead illuminate a clear and compelling agenda for future inquiry. A high priority is to test the external validity of our thematic framework through cross-platform and multilingual replication, which would capture a more diverse range of user populations and cultural contexts. Future work should expand sampling to include discourse on platforms like TikTok and Reddit, and analyze reactions to varied content, from political deepfakes to artistic experiments. Building upon the qualitative robustness demonstrated in our study, such research could also perform quantitative sensitivity analyses to test how the prevalence of these core themes shifts across different content genres (e.g., corporate demos versus independent artistic creations). Such work could be enhanced by longitudinal tracking to monitor how these topics evolve over time as the technology matures and public literacy increases.
To move from thematic mapping to causal understanding, mixed-methods research is essential. Our descriptive framework generates several testable hypotheses for such future work. For example, based on the prevalence of Topic 2 (“The Uncanny Nature”), one could hypothesize that exposure to hyper-realistic but subtly flawed AI videos primes users for greater skepticism towards all digital content. Similarly, stemming from the ‘epistemic anxiety’ in Topics 8 and 13, a follow-up experiment could test the hypothesis that clear provenance labels (e.g., C2PA) are more effective at restoring trust in news-related content than in entertainment content. Qualitative interviews can provide deeper context, while controlled experiments could rigorously evaluate the impact of interventions, such as provenance systems, on user trust. By pursuing these avenues, the research community can build upon the foundational insights of this study to develop a more comprehensive and actionable understanding of the dynamic relationship between generative AI and society.

Author Contributions

Conceptualization, L.Ç. and B.A.Ç.; methodology, L.Ç. and B.A.Ç.; software, L.Ç.; validation, L.Ç. and B.A.Ç.; formal analysis, L.Ç. and B.A.Ç.; investigation, L.Ç. and B.A.Ç.; resources, L.Ç. and B.A.Ç.; data curation, L.Ç. and B.A.Ç.; writing—original draft preparation, L.Ç. and B.A.Ç.; writing—review and editing, L.Ç. and B.A.Ç.; visualization, L.Ç.; supervision, L.Ç. and B.A.Ç.; project administration, L.Ç. and B.A.Ç. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to express their sincere gratitude to the four anonymous reviewers and the handling editor for their insightful comments and constructive feedback, which have significantly improved the quality of this work. The authors also dedicate this study to their son, Utku, whose endless curiosity and passion for discovery have been a constant source of inspiration throughout this research.

Conflicts of Interest

The authors declare that they have no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial Intelligence
LDA: Latent Dirichlet Allocation
ABC: Affective, Behavioral, and Cognitive
ITU: International Telecommunication Union
C2PA: Coalition for Content Provenance and Authenticity
RQ: Research Question
NPMI: Normalized Pointwise Mutual Information
TAM: Technology Acceptance Model
t-SNE: t-Distributed Stochastic Neighbor Embedding

Appendix A

Table A1. Basic Metadata of Analyzed Videos.
Video ID; Title; Channel Title; Published At; Views; Likes
7o3mZuhbse8; Google’s Veo 3 Comparison; Chrissie; 24 May 2025; 180,290; 1206
Iv24AUN8Yd0; Real Or Google Veo 3 AI? Watch the tutorial!; Adrian Viral AI Marketing; 3 June 2025; 221,233; 714
H7GC_qee6E4; A.I. Has Officially Gone Too Far | Google Veo 3 is INSANE; Edited By Geo; 1 June 2025; 3,433,096; 79,878
j4CT5dZe8ZA; Bigfoot—Born to be Bushy (Official Music Video) | Google Veo 3; demonflyingfox; 1 June 2025; 1,738,347; 19,551
-UW6nMGN2Bw; Impossible Challenges 2 (Google Veo 3); demonflyingfox; 29 May 2025; 337,533; 5646
hqlHrK5SEuc; Google Veo 3 Street Interview; GOD; 21 May 2025; 223,454; 839
TmsK_Ym8kD4; Google Veo 3 Demo | Cinematic Scenes and Character Voices (Honest Filmmaker Review); Black Mixture; 23 May 2025; 242,238; 1545
gcZwE5cM4xs; Google’s new AI video tool Veo 3 is WILD!; Impekable; 29 May 2025; 314,684; 1733
j8VGP5pr9OQ; Cinematic Glitches. Veo 3 + Midjourney V7; VaigueMan; 7 June 2025; 333,074; 6575
gwhSPf3S89M; Veo 3 vs. Kling 2.1 Master Cinematic Showdown—Who Wins?; Aivoxy; 28 May 2025; 410,472; 4074
6j1TqZDn6xM; Made with Google’s Veo 3 model, they look you in the eye and break the fourth wall; EDUCATION & TECHNOLOGY; 11 June 2025; 2,995,522; 40,437
US2gO7UYEfY; We Tested Google Veo and Runway to Create This AI Film. It Was Wild. | WSJ; The Wall Street Journal; 28 May 2025; 1,025,022; 23,085
CxX92BBhHBw; Impossible Challenges (Google Veo 3); demonflyingfox; 27 May 2025; 853,874; 14,547
McFChYae6p8; Google Veo 3 Ai Gives You An Existential Crisis; Wulfranz; 25 May 2025; 291,317; 3473
rwUt22HTTx0; Anchors away. Veo 3 is rolling out in 70+ countries and Google AI Pro subscribers can try it too; Google; 24 May 2025; 684,430; 4567
HQ6BDMoKHcs; Google Veo 3 is sooo cool! #ai #veo3 #prompttheory; World Update 3.0; 30 May 2025; 643,198; 44,048
01Fm4mqIq08; Meet Beatboxing Blobfish, made by @AlexanderChen @MathewRayMakes with Veo 3; Google; 31 May 2025; 1,653,519; 55,513
UC_Cw9xqIuE; FREE Veo 3 AI Video Generator: How to Use It WORLDWIDE; How To In 5 Minutes; 2 June 2025; 496,922; 7159
XkpGkAa1nCY; Google’s Veo 3 can now generate audio.; The Verge; 20 May 2025; 302,518; 3443
2T-ZiEdMHvw; Veo3 test // non-existent car show; László Gaál; 22 May 2025; 391,251; 2573
ODyROOW1dCo; Meet Veo 3, our latest video generation model; Google; 31 May 2025; 141,807; 1204
DY5vnaCx_KE; A Time Traveler’s VLOG | Google VEO 3 AI Short Film + Assets Available; uisato; 4 June 2025; 172,575; 3873

References

  1. Kharvi, P.L. Understanding the Impact of AI-Generated Deepfakes on Public Opinion, Political Discourse, and Personal Security in Social Media. IEEE Secur. Priv. 2024, 22, 115–122. [Google Scholar] [CrossRef]
  2. Sala, A. Standards and Policy Considerations for Multimedia Authenticity. 2025. Available online: https://www.itu.int/hub/2025/07/standards-and-policy-considerations-for-multimedia-authenticity/ (accessed on 10 October 2025).
  3. Florance, M.S. Survey Reveals Concerns and Adoption Trends Around AI’s Rising Influence; Rutgers Office of Communications: New Brunswick, NJ, USA, 2024; pp. 1–6. Available online: https://www.rutgers.edu/news/survey-reveals-concerns-and-adoption-trends-around-ais-rising-influence (accessed on 10 October 2025).
  4. Gottfried, J. About Three-Quarters of Americans Favor Steps to Restrict Altered Videos and Images; Pew Research Center: Washington, DC, USA, 2019; pp. 16–18. Available online: https://www.pewresearch.org/short-reads/2019/06/14/about-three-quarters-of-americans-favor-steps-to-restrict-altered-videos-and-images (accessed on 10 October 2025).
  5. Hynek, N.; Gavurova, B.; Kubak, M. Risks and benefits of artificial intelligence deepfakes: Systematic review and comparison of public attitudes in seven European Countries. J. Innov. Knowl. 2025, 10, 100782. [Google Scholar] [CrossRef]
  6. Xu, Z.; Wen, X.; Zhong, G.; Fang, Q. Public perception towards deepfake through topic modelling and sentiment analysis of social media data. Soc. Netw. Anal. Min. 2025, 15, 16. [Google Scholar] [CrossRef]
  7. Henry, A.; Glick, J. WITNESS and MIT Open Documentary Lab, Just joking! Deepfakes, Satire and the Politics of Synthetic Media. 2021. Available online: https://cocreationstudio.mit.edu/just-joking/ (accessed on 10 October 2025).
  8. Le Poidevin, O. UN Report Urges Stronger Measures to Detect AI-Driven Deepfakes; ReutersCom: London, UK, 2025; Available online: https://www.reuters.com/business/un-report-urges-stronger-measures-detect-ai-driven-deepfakes-2025-07-11/ (accessed on 10 October 2025).
  9. Capstick, E. Chapter 8: Public Opinion. In Artificial Intelligence Index Report 2025; Stanford Institute for Human-Centered Artificial Intelligence (HAI): Stanford, CA, USA, 2025; pp. 1–21. Available online: https://hai.stanford.edu/assets/files/hai_ai-index-report-2025_chapter8_final.pdf (accessed on 10 October 2025).
  10. Mcclain, B.Y.C.; Kennedy, B.; Gottfried, J.; Anderson, M.; Pasquini, G. How the U.S. Public and AI Experts View Artificial Intelligence. 2025. Available online: https://www.pewresearch.org/internet/2025/04/03/how-the-us-public-and-ai-experts-view-artificial-intelligence/ (accessed on 10 October 2025).
  11. Puntoni, S.; Reczek, R.W.; Giesler, M.; Botti, S. Consumers and Artificial Intelligence: An Experiential Perspective. J. Mark. 2020, 85, 131–151. [Google Scholar] [CrossRef]
  12. Demmer, T.R.; Kühnapfel, C.; Fingerhut, J.; Pelowski, M. Does an emotional connection to art really require a human artist? Emotion and intentionality responses to AI- versus human-created art and impact on aesthetic experience. Comput. Hum. Behav. 2023, 148, 107875. [Google Scholar] [CrossRef]
  13. Lazer, D.M.J.; Baum, M.A.; Benkler, Y.; Berinsky, A.J.; Greenhill, K.M.; Menczer, F.; Metzger, M.J.; Nyhan, B.; Pennycook, G.; Rothschild, D.; et al. The science of fake news. Science 2018, 359, 1094–1096. [Google Scholar] [CrossRef] [PubMed]
  14. Vosoughi, S.; Roy, D.; Aral, S. The spread of true and false news online. Science 2018, 359, 1146–1151. [Google Scholar] [CrossRef]
  15. Berger, J.; Milkman, K.L. What Makes Online Content Viral? J. Mark. Res. 2012, 49, 192–205. Available online: https://journals.sagepub.com/doi/10.1509/jmr.10.0353 (accessed on 10 October 2025). [CrossRef]
  16. Chesney, B.; Citron, D. Deep fakes: A looming challenge for privacy, democracy, and national security. Calif. Law Rev. 2019, 107, 1753–1820. [Google Scholar] [CrossRef]
  17. Holleman, G.A.; Hooge, I.T.C.; Kemner, C.; Hessels, R.S. The ‘Real-World Approach’ and Its Problems: A Critique of the Term Ecological Validity. Front. Psychol. 2020, 11, 1–12. [Google Scholar] [CrossRef]
  18. Seo, I.T.; Liu, H.; Li, H.; Lee, J.S. AI-infused video marketing: Exploring the influence of AI-generated tourism videos on tourist decision-making. Tour Manag. 2025, 110, 105182. [Google Scholar] [CrossRef]
  19. Belanche, D.; Ibáñez-Sánchez, S.; Jordán, P.; Matas, S. Customer reactions to generative AI vs. real images in high-involvement and hedonic services. Int. J. Inf. Manag. 2025, 85, 102954. [Google Scholar] [CrossRef]
  20. Gupta, P.; Ding, B.; Guan, C.; Ding, D. Generative AI: A systematic review using topic modelling techniques. Data Inf. Manag. 2024, 8, 100066. [Google Scholar] [CrossRef]
  21. Sun, Y.; Tsuruta, H.; Kumagai, M.; Kurosaki, K. YouTube-based topic modeling and large language model sentiment analysis of Japanese online discourse on nuclear energy. J. Nucl. Sci. Technol. 2025, 1–13. [Google Scholar] [CrossRef]
  22. Kaya, S. Investigation of User Comments on Videos Generated by Deepfake Technology. Acta Infologica 2025, 9, 208–222. [Google Scholar] [CrossRef]
  23. Solomon, M.R.; Bamossy, G.J.; Askegaard, S.T.; Hogg, M.K. Consumer Behaviour: A European Perspective-Pearson Education Limited; Pearson Education: London, UK, 2013; 672p. [Google Scholar]
  24. PBS News Hour. The Potentially Dangerous Implications of an AI Tool Creating Extremely Realistic Video. 2024. Available online: https://www.pbs.org/newshour/show/the-potentially-dangerous-implications-of-an-ai-tool-creating-extremely-realistic-video (accessed on 14 August 2025).
  25. Grewal, D.; Guha, A.; Satornino, C.B.; Schweiger, E.B. Artificial intelligence: The light and the darkness. J. Bus. Res. 2021, 136, 229–236. [Google Scholar] [CrossRef]
  26. Ma, Y.M.; Dai, X.; Deng, Z. Using machine learning to investigate consumers’ emotions: The spillover effect of AI defeating people on consumers’ attitudes toward AI companies. Internet Res. 2024, 34, 1679–1713. [Google Scholar] [CrossRef]
  27. Mirbabaie, M.; Brünker, F.; Möllmann Frick, N.R.J.; Stieglitz, S. The rise of artificial intelligence—Understanding the AI identity threat at the workplace. Electron Mark. 2022, 32, 73–99. [Google Scholar] [CrossRef]
  28. Rana, N.P.; Chatterjee, S.; Dwivedi, Y.K.; Akter, S. Understanding dark side of artificial intelligence (AI) integrated business analytics: Assessing firm’s operational inefficiency and competitiveness. Eur. J. Inf. Syst. 2022, 31, 364–387. [Google Scholar] [CrossRef]
  29. Gligor, D.M.; Pillai, K.G.; Golgeci, I. Theorizing the dark side of business-to-business relationships in the era of AI, big data, and blockchain. J. Bus. Res. 2021, 133, 79–88. [Google Scholar] [CrossRef]
  30. Sun, Y.; Li, S.; Yu, L. The dark sides of AI personal assistant: Effects of service failure on user continuance intention. Electron Mark. 2022, 32, 17–39. [Google Scholar] [CrossRef]
  31. Castillo, D.; Canhoto, A.I.; Said, E. The dark side of AI-powered service interactions: Exploring the process of co-destruction from the customer perspective. Serv. Ind. J. 2021, 41, 900–925. [Google Scholar] [CrossRef]
  32. Tong, S.; Jia, N.; Luo, X.; Fang, Z. The Janus face of artificial intelligence feedback: Deployment versus disclosure effects on employee performance. Strateg. Manag. J. 2021, 42, 1600–1631. [Google Scholar] [CrossRef]
  33. Mori, M. The uncanny valley. IEEE Robot. Autom. Mag. 2012, 19, 98–100. [Google Scholar] [CrossRef]
  34. Liu, Q.; Lian, Z.; Osman, L.H. Can Artificial Intelligence-Generated Sponsored Vlogs Trigger Online Shopping? Exploring the Impact on Consumer Purchase Intentions. J. Promot. Manag. 2025, 31, 798–830. [Google Scholar] [CrossRef]
  35. Rahman, A.; Naji, J. The Era of AI-Generated Video Production Exploring Consumers’ Attitudes. 2024. Available online: https://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1868060&dswid=-6145 (accessed on 14 August 2025).
  36. Arango, L.; Singaraju, S.P.; Niininen, O. Consumer Responses to AI-Generated Charitable Giving Ads. J. Advert. 2023, 52, 486–503. [Google Scholar] [CrossRef]
  37. Roman, D. AI-Generated Videos: Innovation, Risks & Rewards. 2023. Available online: https://wearebrain.com/blog/era-of-ai-generated-videos/ (accessed on 14 August 2025).
  38. Madathil, J.C. Generative AI advertisements and Human–AI collaboration: The role of humans as gatekeepers of humanity. J. Retail. Consum. Serv. 2025, 87, 104381. [Google Scholar] [CrossRef]
  39. Cao, G.; Duan, Y.; Edwards, J.S.; Dwivedi, Y.K. Understanding managers’ attitudes and behavioral intentions towards using artificial intelligence for organizational decision-making. Technovation 2021, 106, 102312. [Google Scholar] [CrossRef]
  40. Kshetri, N.; Dwivedi, Y.K.; Davenport, T.H.; Panteli, N. Generative artificial intelligence in marketing: Applications, opportunities, challenges, and research agenda. Int. J. Inf. Manag. 2024, 75, 102716. [Google Scholar] [CrossRef]
  41. Latikka, R.; Bergdahl, J.; Savela, N.; Oksanen, A. AI as an Artist? A Two-Wave Survey Study on Attitudes Toward Using Artificial Intelligence in Art. Poetics 2023, 101, 101839. [Google Scholar] [CrossRef]
  42. Yu, T.; Tian, Y.; Chen, Y.; Huang, Y.; Pan, Y.; Jang, W. How Do Ethical Factors Affect User Trust and Adoption Intentions of AI-Generated Content Tools? Evidence from a Risk-Trust Perspective. Systems 2025, 13, 461. [Google Scholar] [CrossRef]
  43. Lao, Y.; Hirvonen, N.; Larsson, S. AI and authenticity: Young people’s practices of information credibility assessment of AI-generated video content. J. Inf. Sci. 2025. [CrossRef]
  44. Stamkou, C.; Saprikis, V.; Fragulis, G.F.; Antoniadis, I. User Experience and Perceptions of AI-Generated E-Commerce Content: A Survey-Based Evaluation of Functionality, Aesthetics, and Security. Data 2025, 10, 89. [Google Scholar] [CrossRef]
  45. Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent Dirichlet Allocation. J. Mach. Learn. Res. 2003, 3, 993–1022. [Google Scholar]
  46. Chang, J.; Boyd-Graber, J.; Gerrish, S.; Wang, C.; Blei, D.M. Reading tea leaves: How humans interpret topic models. In Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009, Vancouver, BC, Canada, 7–10 December 2009; Curran Associates Inc.: Red Hook, NY, USA, 2009; pp. 288–296. [Google Scholar]
  47. Roberts, M.E.; Stewart, B.M.; Tingley, D. Stm: An R package for structural topic models. J. Stat. Softw. 2019, 91, 1–40. [Google Scholar] [CrossRef]
  48. Laureate, C.D.P.; Buntine, W.; Linger, H. A systematic review of the use of topic models for short text social media analysis. Artif. Intell. Rev. 2023, 56, 14223–14255. [Google Scholar] [CrossRef]
  49. Muthusami, R.; Mani Kandan, N.; Saritha, K.; Narenthiran, B.; Nagaprasad, N.; Ramaswamy, K. Investigating topic modeling techniques through evaluation of topics discovered in short texts data across diverse domains. Sci. Rep. 2024, 14, 1–13. [Google Scholar] [CrossRef]
  50. Bernhard-Harrer, J.; Ashour, R.; Eberl, J.M.; Tolochko, P.; Boomgaarden, H. Beyond standardization: A comprehensive review of topic modeling validation methods for computational social science research. Polit. Sci. Res. Methods 2025, 1–19. [Google Scholar] [CrossRef]
  51. Bertoni, E.; Fontana, M.; Gabrielli, L.; Signorelli, S.; Vespe, M. Handbook of Computational Social Science for Policy; Springer Nature: Berlin/Heidelberg, Germany, 2023; pp. 1–490. [Google Scholar]
  52. Lundberg, J. Towards a Conceptual Framework for System of Systems. In Proceedings of the Doctoral Consortium Papers Presented at the 35th International Conference on Advanced Information Systems Engineering (CAiSE 2023), CEUR Workshop Proceedings, Zaragoza, Spain, 12–16 June 2023; Volume 3407, pp. 18–24. [Google Scholar]
  53. Sartori, L.; Bocca, G. Minding the gap(s): Public perceptions of AI and socio-technical imaginaries. AI Soc. 2023, 38, 443–458. [Google Scholar] [CrossRef]
  54. Gorwa, R.; Binns, R.; Katzenbach, C. Algorithmic content moderation: Technical and political challenges in the automation of platform governance. Big Data Soc. 2020, 7. [Google Scholar] [CrossRef]
  55. Cinelli, M.; de Francisci Morales, G.; Galeazzi, A.; Quattrociocchi, W.; Starnini, M. The echo chamber effect on social media. Proc. Natl. Acad. Sci. USA 2021, 118, e2023301118. [Google Scholar] [CrossRef]
  56. Bakshy, E.; Messing, S.; Adamic, L.A. Exposure to ideologically diverse news and opinion on Facebook. Science 2015, 348, 1130–1132. [Google Scholar] [CrossRef]
  57. Leitch, A.; Chen, C. Unlimited Editions: Documenting Human Style in AI Art Generation. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 26 April–1 May 2025; Volume 1. [Google Scholar]
  58. Becker, C.; Conduit, R.; Chouinard, P.A.; Laycock, R. Can deepfakes be used to study emotion perception? A comparison of dynamic face stimuli. Behav. Res. Methods 2024, 56, 7674–7690. [Google Scholar] [CrossRef]
  59. Eiserbeck, A.; Maier, M.; Baum, J.; Abdel Rahman, R. Deepfake smiles matter less—The psychological and neural impact of presumed AI-generated faces. Sci. Rep. 2023, 13, 16111. [Google Scholar] [CrossRef]
  60. Brady, W.J.; Wills, J.A.; Jost, J.T.; Tucker, J.A.; Van Bavel, J.J.; Fiske, S.T. Emotion shapes the diffusion of moralized content in social networks. Proc. Natl. Acad. Sci. USA 2017, 114, 7313–7318. [Google Scholar] [CrossRef]
  61. Diel, A.; Lalgi, T.; Schröter, I.C.; MacDorman, K.F.; Teufel, M.; Bäuerle, A. Human performance in detecting deepfakes: A systematic review and meta-analysis of 56 papers. Comput. Hum. Behav. Rep. 2024, 16, 100538. [Google Scholar] [CrossRef]
  62. Groh, M.; Epstein, Z.; Firestone, C.; Picard, R. Deepfake detection by human crowds, machines, and machine-informed crowds. Proc. Natl. Acad. Sci. USA 2022, 119, e2110013119. [Google Scholar] [CrossRef] [PubMed]
  63. Pennycook, G.; Cannon, T.D.; Rand, D.G. Prior exposure increases perceived accuracy of fake news. J. Exp. Psychol. Gen. 2018, 147, 1865–1880. [Google Scholar] [CrossRef] [PubMed]
  64. Whittlestone, J.; Nyrup, R.; Alexandrova, A.; Cave, S. The Role and Limits of Principles in AI Ethics. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, New York, NY, USA, 27–28 January 2019; ACM: New York, NY, USA, 2019; pp. 195–200. [Google Scholar] [CrossRef]
  65. Crawford, K.; Calo, R. There is a blind spot in AI research. Nature 2016, 538, 311–313. [Google Scholar] [CrossRef]
  66. Singhal, A.; Neveditsin, N.; Tanveer, H.; Mago, V. Toward Fairness, Accountability, Transparency, and Ethics in AI for Social Media and Health Care: Scoping Review. JMIR Med. Inform. 2024, 12, e50048. [Google Scholar] [CrossRef]
  67. Jobin, A.; Ienca, M.; Vayena, E. Artificial Intelligence: The global landscape of ethics guidelines. arXiv 2019, arXiv:1906.11668. [Google Scholar] [CrossRef]
  68. Blei, D.M. Probabilistic topic models. Commun. ACM 2012, 55, 77–84. Available online: https://dl.acm.org/doi/10.1145/2133806.2133826 (accessed on 14 August 2025). [CrossRef]
  69. Griffiths, T.L.; Steyvers, M. Finding scientific topics. Proc. Natl. Acad. Sci. USA 2004, 101 (Suppl. S1), 5228–5235. [Google Scholar] [CrossRef]
70. Blei, D.M.; Lafferty, J.D. Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; ACM: New York, NY, USA, 2006; Volume 148, pp. 113–120. [Google Scholar]
  71. Arora, S.; Ge, R.; Halpern, Y.; Mimno, D.; Moitra, A.; Sontag, D.; Wu, Y.; Zhu, M. A practical algorithm for topic modeling with provable guarantees. In Proceedings of the 30th International Conference on Machine Learning, PMLR, Atlanta, GA, USA, 17–19 June 2013; Volume 28, pp. 939–947. [Google Scholar]
  72. Grimmer, J.; Stewart, B.M. Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Polit Anal. 2013, 21, 267–297. [Google Scholar] [CrossRef]
  73. Bayılmış, O.Ü.; Orhan, S.; Bayılmış, C. Unveiling Gig Economy Trends via Topic Modeling and Big Data. Systems 2025, 13, 553. [Google Scholar] [CrossRef]
  74. Çallı, L.; Çallı, F. Understanding Airline Passengers during Covid-19 Outbreak to Improve Service Quality: Topic Modeling Approach to Complaints with Latent Dirichlet Allocation Algorithm. Transp. Res. Rec. J. Transp. Res. Board. 2022, 2677, 036119812211120. [Google Scholar] [CrossRef]
  75. Çallı, L. Exploring mobile banking adoption and service quality features through user-generated content: The application of a topic modeling approach to Google Play Store reviews. Int. J. Bank Mark. 2023, 41, 428–454. [Google Scholar] [CrossRef]
  76. Alma Çallı, B.; Ediz, Ç. Top concerns of user experiences in Metaverse games: A text-mining based approach. Entertain. Comput. 2023, 46, 100576. [Google Scholar] [CrossRef]
  77. Çallı, L.; Çallı, B.A. Value-centric analysis of user adoption for sustainable urban micro-mobility transportation through shared e-scooter services. Sustain. Dev. 2024, 32, 6408–6433. [Google Scholar] [CrossRef]
78. Blei, D.M.; Lafferty, J.D. Topic models. In Mining Text Data; Srivastava, A., Sahami, M., Eds.; Springer International Publishing: Cham, Switzerland, 2009. [Google Scholar]
  79. Lau, J.H.; Newman, D.; Baldwin, T. Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, EACL, Gothenburg, Sweden, 26–30 April 2014; pp. 530–539. [Google Scholar]
  80. Röder, M.; Both, A.; Hinneburg, A. Exploring the space of topic coherence measures. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, Shanghai, China, 2–6 February 2015; pp. 399–408. [Google Scholar]
  81. Mimno, D.; Wallach, H.M.; Talley, E.; Leenders, M.; McCallum, A. Optimizing semantic coherence in topic models. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, EMNLP, Edinburgh, UK, 27–31 July 2011; pp. 262–272. [Google Scholar]
  82. Wallach, H.M.; Mimno, D.; McCallum, A. Rethinking LDA: Why priors matter. In Proceedings of the 23rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 7–10 December 2009; pp. 1973–1981. [Google Scholar]
  83. Roberts, M.E.; Stewart, B.M.; Tingley, D.; Lucas, C.; Leder-Luis, J.; Gadarian, S.K.; Albertson, B.; Rand, D.G. Structural topic models for open-ended survey responses. Am. J. Pol. Sci. 2014, 58, 1064–1082. [Google Scholar] [CrossRef]
  84. Yin, R.K. Case Study Research and Applications: Design and Methods; Sage: Thousand Oaks, CA, USA, 2018. [Google Scholar]
  85. Bruns, A.; Burgess, J. Twitter hashtags from ad hoc to calculated publics. In Hashtag Publics: The Power and Politics of Discursive Networks; Peter Lang Inc.: New York, NY, USA, 2015; Volume 103. [Google Scholar]
  86. Newman, D.; Lau, J.H.; Grieser, K.; Baldwin, T. Automatic evaluation of topic coherence. In Proceedings of the Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Los Angeles, CA, USA, 2–4 June 2010; pp. 100–108. [Google Scholar]
  87. Murakami, R.; Chakraborty, B. Investigating the Efficient Use of Word Embedding with Neural-Topic Models for Interpretable Topics from Short Texts. Sensors 2022, 22, 852. [Google Scholar] [CrossRef]
88. Davis, F.D. Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Q. 1989, 13, 319–340. [Google Scholar]
  89. Acemoglu, D.; Restrepo, P. Automation and new tasks: How technology displaces and reinstates labor. J. Econ. Perspect. 2019, 33, 3–30. [Google Scholar] [CrossRef]
  90. startuphub.ai. Sora 2: A Glimpse into Generative Video’s Uncanny Valley and Creative Frontier. 2025. Available online: https://www.startuphub.ai/ai-news/ai-video/2025/sora-2-a-glimpse-into-generative-videos-uncanny-valley-and-creative-frontier/ (accessed on 10 October 2025).
91. Jenkins, H. Convergence Culture: Where Old and New Media Collide; New York University Press: New York, NY, USA, 2006. [Google Scholar]
  92. Reddit. Will Smith Eating Spaghetti—2.5 Years Later. 2025. Available online: https://www.reddit.com/r/ChatGPT/comments/1o22zh9/will_smith_eating_spaghetti_25_years_later/ (accessed on 10 October 2025).
  93. Gillespie, T. The politics of “platforms.”. New Media Soc. 2010, 12, 347–364. [Google Scholar] [CrossRef]
  94. Bischof, J.M.; Airoldi, E.M. Summarizing topical content with word frequency and exclusivity. In Proceedings of the 29th International Coference on International Conference on Machine Learning, ICML, Madison, WI, USA, 26 June–1 July 2012; Volume 1, pp. 201–208. [Google Scholar]
95. Manning, C.D.; Raghavan, P.; Schütze, H. Introduction to Information Retrieval; Cambridge University Press: New York, NY, USA, 2008. [Google Scholar]
96. van der Maaten, L.; Hinton, G. Visualizing Data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
  97. Hagen, L. Content analysis of e-petitions with topic modeling: How to train and evaluate LDA models? Inf. Process. Manag. 2018, 54, 1292–1307. [Google Scholar] [CrossRef]
98. Neuendorf, K.A. The Content Analysis Guidebook, 2nd ed.; SAGE Publications, Inc.: Thousand Oaks, CA, USA, 2017. [Google Scholar]
  99. Scott, W.A. Reliability of content analysis: The case of nominal scale coding. Public Opin. Q. 1955, 19, 321. [Google Scholar] [CrossRef]
100. Krippendorff, K. Content Analysis: An Introduction to Its Methodology; SAGE Publications, Inc.: Thousand Oaks, CA, USA, 2013. [Google Scholar]
  101. Cohen, J. A Coefficient of Agreement for Nominal Scales. Educ. Psychol. Meas. 1960, 20, 37–46. [Google Scholar] [CrossRef]
  102. Landis, J.R.; Koch, G.G. The Measurement of Observer Agreement for Categorical Data. Biometrics 1977, 33, 159. [Google Scholar] [CrossRef]
  103. Braun, V.; Clarke, V. Using thematic analysis in psychology. Qual. Res. Psychol. 2006, 3, 77–101. [Google Scholar] [CrossRef]
Figure 1. An ABC-Based Model of User Reactions to AI-Generated Videos.
Figure 2. Visual Summary of the Video Thumbnail Dataset.
Figure 3. Coherence Scores of Topics.
Figure 4. Visualizing the Uncanny Valley: A Comparison of a Human Subject and a Sora 2 Digital Likeness [90].
Figure 5. Will Smith Eating Spaghetti [92].
Figure 6. Comment Distribution.
Table 2. A Hierarchical Classification Framework for AI Video Comments (Definitions and Examples).
Category: Socio-Technical Systems and Platforms
Definition: This category is not about the video itself but about the tool that made it and the company that provides it. A "platform" is the service or website that delivers the technology to us (e.g., Google, YouTube, TikTok); think of it as the stage on which the content is presented.
What to look for in comments: comments about the company ("Google is just after money again."); comments about the tool ("This AI is better than the other one."); comments about access ("Why doesn't this work in my country?").
Category: AI-Generated Content and Esthetics
Definition: This category is about the video itself: how it looks, how it makes you feel, and whether it is funny, beautiful, or creepy. In other words, it captures our immediate, personal reactions while watching.
What to look for in comments: comments about the visuals ("This looks so realistic!" or "The hands are messed up again."); comments about the feeling ("I love this!" or "This video is so uncanny."); comments about the humor ("The scene where the car melts is hilarious.").
Category: Societal and Ethical Implications
Definition: This category is about the "big picture": not one person, but what this technology will do to society as a whole, our future, and our sense of truth. "Epistemic anxiety" is the fear that we can no longer trust our own eyes and ears.
What to look for in comments: comments about truth and fakes ("We can't trust any video anymore.", a perfect epistemic comment); comments about jobs ("Artists are going to lose their jobs."); comments about safety and rules ("This needs to be banned!").
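The coding scheme in Table 2 lends itself to a standard reliability check. The snippet below is a minimal sketch, assuming two hypothetical coders have applied the three categories to the same sample of comments; it is illustrative rather than a record of the study's actual coding session, and uses Cohen's kappa [101] interpreted against the Landis and Koch benchmarks [102].

```python
# Minimal, illustrative sketch: checking agreement between two hypothetical coders
# who applied the three categories of Table 2 to the same sample of comments.
from sklearn.metrics import cohen_kappa_score

# Hypothetical labels: P = Platforms, E = Content/Esthetics, S = Societal/Ethical.
coder_a = ["E", "S", "P", "E", "S", "P", "E", "S"]
coder_b = ["E", "S", "E", "E", "S", "P", "E", "P"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa = {kappa:.2f}")
# Landis and Koch read 0.61-0.80 as substantial and >0.80 as almost perfect agreement.
```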
Table 3. Conceptual Framework for Exploring User Experience of AI-Generated Video.
The rows below cross each thematic category with the ABC Model of Attitudes: the Affective Dimension (Feelings and Emotions), the Behavioral Dimension (Actions and Intentions), and the Cognitive Dimension (Beliefs and Judgments).
AI-Generated Content and Esthetics
Affective: Instant, image-driven feelings (awe, unease, amusement). (e.g., "That face is so real it gave me chills." / "This scene was hilarious—I laughed out loud.")
Behavioral: Active use or creative reuse of content (save, share, remix into memes). (e.g., "I downloaded it and made my own version for my story.")
Cognitive: Judging quality and authenticity (originality, craftsmanship, realism). (e.g., "Looks great—is it real or generated?")
Socio-Technical Systems and Platforms
Affective: Emotions toward the tech and companies (hope, suspicion, discomfort). (e.g., "I don't trust that company; they're hiding things.")
Behavioral: Practical steps to access or use the tech (subscribe, use a VPN, switch platforms). (e.g., "It's paywalled—I used a VPN or moved to another site.")
Cognitive: Strategic beliefs about platform motives and power (monopoly, profit-driven behavior). (e.g., "The algorithm pushes this because it makes money from ads.")
Societal and Ethical Implications
Affective: Deep feelings about broader consequences (fear, anxiety, distrust). (e.g., "I'm scared—how can we trust video evidence anymore?")
Behavioral: Civic or adaptive actions driven by norms (advocate for regulation, change sharing habits). (e.g., "I signed a petition and stopped sharing political clips.")
Cognitive: Core ethical judgments about society, truth, and labor (fairness, job displacement). (e.g., "Could this tech take creators' jobs—is that fair?")
Table 4. Topics.
Topic | Title | Keywords | Total Comments
0 | Technical Support and Global Access to AI Tools | talking, exact, close, twenti, law, text | 1064
1 | AI’s Impact on Reality, Jobs, and Human Creativity | real, jobs, imagin (imagination), control, creat (create) | 929
2 | The Uncanny Nature of AI-Generated Content | ai generated, light, nobodi, uncanni, details | 1240
3 | The ‘Will Smith Eating Spaghetti’ Meme Benchmark | will_smith_spaghetti, prompt, audio, smith eating, dangerous | 1652
4 | Belief, Influence, and the Cost of AI | believ, eye, influenc, ads, cooked, expens | 524
5 | AI in Music, Advertising, and Entertainment | music, ad, perfect, song, fun | 681
6 | Skepticism Towards AI Demonstrations (Google Veo) | googl, veo, pay, tools, access | 395
7 | The Artificial Nature of AI and User Reactions | crazi (crazy), whats real, voic (voice), artifici (artificial), cant tell | 670
8 | Fear, Evidence, and Distrust in AI | scare, evidenc (evidence), tell differ, flow, weird | 748
9 | Absurd and Comical AI-Generated Scenarios | funny, hand, background, war, soldier | 624
10 | AI’s Role in the Film and Entertainment Industry | film_movi, act, creat, cost, industri (industry) | 514
11 | The AI Video Generation Arms Race (Veo vs. Kling) | veo, kling, creat, art, generat | 617
12 | AI’s Depiction of Surreal and Mythical Content | love, bigfoot, enjoy, lose, motion | 585
13 | Trust, News, and Technological Advancements in Media | camera, cooked, news, trust, technolog | 590
14 | Commercial and Business Use of AI Software | commerci (commercial), idea, softwar (software), studi (studio), peopl ai | 585
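The "Total Comments" column reports, for each topic, the number of comments for which that topic is dominant. The minimal sketch below, assuming a gensim-style LDA workflow in Python with toy token lists (not the study's actual preprocessing), shows how such a tally could be produced.

```python
# Minimal sketch (not the study's exact pipeline): fit an LDA model with gensim and
# tally, per topic, how many comments have it as their dominant topic.
from collections import Counter

from gensim import corpora
from gensim.models import LdaModel

# Toy token lists standing in for 11,418 preprocessed comments.
docs = [["ai", "video", "real"], ["job", "artist", "replace"], ["funny", "hand", "weird"]]

dictionary = corpora.Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

# 15 topics, as reported in Table 4; random_state fixed for reproducibility.
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=15, random_state=42, passes=10)

# Dominant topic per comment = topic with the highest posterior probability.
dominant = [max(lda.get_document_topics(bow), key=lambda t: t[1])[0] for bow in corpus]
print(Counter(dominant))  # topic id -> number of comments where it is dominant
```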
Table 5. Topic Diagnostics Measurement.
Topic | Title | Coherence | Tokens | Exclusivity | Cosine Dist.
0 | Technical Support and Global Access to AI Tools | 0.3733 | 17,952 | 0.8969 | 0.9791
1 | AI’s Impact on Reality, Jobs, and Human Creativity | 0.7912 | 21,735 | 0.6754 | 0.9697
2 | The Uncanny Nature of AI-Generated Content | 0.298 | 16,142 | 0.9336 | 0.9872
3 | The ‘Will Smith Eating Spaghetti’ Meme Benchmark | 0.4166 | 15,565 | 0.9389 | 0.9862
4 | Belief, Influence, and the Cost of AI | 0.2609 | 15,620 | 0.9744 | 0.9837
5 | AI in Music, Advertising, and Entertainment | 0.3188 | 13,953 | 0.9143 | 0.9834
6 | Skepticism Towards AI Demonstrations (Google Veo) | 0.5436 | 16,484 | 0.7596 | 0.9728
7 | The Artificial Nature of AI and User Reactions | 0.1379 | 15,286 | 0.8807 | 0.9797
8 | Fear, Evidence, and Distrust in AI | 0.2216 | 14,859 | 0.8843 | 0.9829
9 | Absurd and Comical AI-Generated Scenarios | 0.897 | 17,816 | 0.8296 | 0.9756
10 | AI’s Role in the Film and Entertainment Industry | 0.6488 | 30,320 | 0.6726 | 0.9713
11 | The AI Video Generation Arms Race (Veo vs. Kling) | 0.7379 | 34,003 | 0.8663 | 0.9553
12 | AI’s Depiction of Surreal and Mythical Content | 0.3302 | 13,576 | 0.8737 | 0.9851
13 | Trust, News, and Technological Advancements in Media | 0.515 | 16,259 | 0.9052 | 0.9744
14 | Commercial and Business Use of AI Software | 0.3179 | 12,942 | 0.943 | 0.9885
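The diagnostics in Table 5 can be approximated from a fitted topic model. The sketch below reuses the hypothetical lda, docs, and dictionary objects from the previous sketch; the coherence call follows the c_v measure discussed in [80], while the exclusivity ratio and pairwise cosine distance shown here are illustrative formulations and not necessarily the exact computations used in the study.

```python
# Illustrative diagnostics for a fitted gensim LDA model ("lda", "docs", "dictionary"
# as in the previous sketch). Exclusivity here is a simple ratio over each topic's
# top words; cosine distance is averaged over all other topics.
import numpy as np
from gensim.models import CoherenceModel
from scipy.spatial.distance import cosine

# Per-topic coherence with the c_v measure.
cm = CoherenceModel(model=lda, texts=docs, dictionary=dictionary, coherence="c_v")
per_topic_coherence = cm.get_coherence_per_topic()

# Topic-word probability matrix: rows = topics, columns = vocabulary terms.
phi = lda.get_topics()
top_idx = np.argsort(phi, axis=1)[:, -10:]  # indices of each topic's top 10 words

# Share of a topic's top-word probability mass not claimed by the other topics.
exclusivity = [
    float(np.mean(phi[k, top_idx[k]] / phi[:, top_idx[k]].sum(axis=0)))
    for k in range(phi.shape[0])
]

# Mean cosine distance from each topic to every other topic (higher = more distinct).
cos_dist = [
    float(np.mean([cosine(phi[k], phi[j]) for j in range(phi.shape[0]) if j != k]))
    for k in range(phi.shape[0])
]
```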
Table 6. A Thematic Framework of Public Discourse on Generative AI.
Category | Rationale | Included Topics
Socio-Technical Systems and Platforms | This category consolidates topics focused on the AI technologies themselves—the platforms, tools, and the competitive dynamics of their development and accessibility. | 0, 6, 11, 14
AI-Generated Content and Esthetics | This category groups topics that analyze the AI-generated media artifacts—their esthetic qualities, genre applications, cultural impact, and the audience’s interpretation of them. | 2, 3, 5, 7, 9, 10, 12
Societal and Ethical Implications | This category encompasses the broad, societal consequences of generative AI, including debates on labor, truth, trust, fear, and the future of media. | 1, 4, 8, 13
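Because Table 6 is a many-to-one mapping from topics to categories, category-level comment volumes follow directly from Table 4. The snippet below simply encodes that mapping; the three totals it prints (2661, 5966, and 2791) sum to the 11,418 comments analyzed, provided each comment is counted once under its dominant topic.

```python
# Roll the 15 topics up into the three categories of Table 6 using the per-topic
# comment counts from Table 4.
CATEGORY_TOPICS = {
    "Socio-Technical Systems and Platforms": [0, 6, 11, 14],
    "AI-Generated Content and Esthetics": [2, 3, 5, 7, 9, 10, 12],
    "Societal and Ethical Implications": [1, 4, 8, 13],
}

COMMENTS_PER_TOPIC = {
    0: 1064, 1: 929, 2: 1240, 3: 1652, 4: 524, 5: 681, 6: 395, 7: 670,
    8: 748, 9: 624, 10: 514, 11: 617, 12: 585, 13: 590, 14: 585,
}

for category, topics in CATEGORY_TOPICS.items():
    total = sum(COMMENTS_PER_TOPIC[t] for t in topics)
    print(f"{category}: {total} comments")
# Prints 2661, 5966, and 2791, which together equal the 11,418 comments analyzed.
```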
Table 7. A Dynamic Framework of Public Attitudes: Integrating Thematic Layers with the ABC Model.
The rows below map the three thematic categories onto the ABC Model of Attitudes: the Affective Dimension (Feelings and Emotions), the Behavioral Dimension (Actions and Intentions), and the Cognitive Dimension (Beliefs and Judgments).
AI-Generated Content and Esthetics
Affective: Awe and Unease: the visceral, gut-level reaction to the content itself. Topic 2: the emotion of unease from the “uncanny valley.” Topic 9: the feeling of amusement from absurd glitches. Topic 12: the emotion of enjoyment from surreal and mythical content.
Behavioral: Curation and Creation: the active engagement with and use of AI media. Topic 3: the act of memetic participation using the ‘Will Smith’ benchmark. Topic 5: the practice of integrating AI into cultural products like music and ads.
Cognitive: Technical Critique: the analytical evaluation of the artifact’s quality and authenticity. Topic 7: the cognitive struggle to determine “what’s real.” Topic 10: the critical judgment on AI’s disruptive role in the film industry.
Socio-Technical Systems and Platforms
Affective: Anticipation and Skepticism: the emotional orientation toward the companies and tools. This is a less dominant dimension, often linked to cognitive beliefs about the platform.
Behavioral: Adoption and Navigation: the tangible actions related to using and accessing technology. Topic 0: the practical action of dealing with access barriers (VPNs, etc.). Topic 11: the behavior of comparison in the “arms race” between tools. Topic 14: the intended action of commercial and business use.
Cognitive: Strategic Evaluation: the formation of beliefs about platform motives and market dynamics. Topic 6: the skeptical judgment of curated corporate demos.
Societal and Ethical Implications
Affective: Fear and Anxiety: the deep-seated emotional response to AI’s potential societal harms. Topic 8: the raw emotion of fear that evidence can no longer be trusted.
Behavioral: Advocacy and Adaptation: the intentions that arise from ethical concerns, driving calls for new behaviors. While no single topic is purely behavioral, the cognitive beliefs below are the direct precursors to actions like calling for regulation or changing media consumption habits.
Cognitive: Worldview Formation: the core beliefs and judgments about AI’s fundamental impact. Topic 1: the belief about AI’s impact on jobs and reality. Topic 4: the judgment on AI’s power to influence and deceive. Topic 13: the cognitive conclusion that trust in news media is collapsing.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
