1. Introduction
Social media platforms have become central to modern communication, serving as primary channels for information dissemination and public discourse. While they empower users to share diverse opinions, they also facilitate the rapid spread of negative sentiments, that is, expressions of dissatisfaction, anger, or criticism. The influence of these sentiments is profound: they can impact financial markets, destabilize communities, and damage professional reputations. For instance, a small minority of users can generate the vast majority of toxic content or misinformation [1], creating a distorted and often hostile online environment.
Social media platforms allow people to discuss various topics, and popular topics receive more comments than others. Research identifies two main reasons why people write angry comments online: they either mirror the negative emotions they observe in others, or they feel less personally responsible when many people are commenting [2]. Both situations, having lots of negative comments or very few, can lead to more online arguments. Reading negative content evokes negative emotions, which in turn motivate readers to write hostile comments themselves [3]. Several studies frame negative sentiment as verbal aggression in computer-mediated communication or examine it through the theoretical lens of emotional contagion [4]. Users exposed to negative online environments are hypothesized to be more likely to engage in inflammatory behavior. Such environments include harassment-related news articles, controversial public debate topics, and negative reader comments [5]. Two boundary conditions that may moderate this effect are examined: the pre-existing beliefs of users and the volume of negative comments present. Regarding pre-existing beliefs in group contexts, research demonstrates that individuals construct their identities through affiliation with groups sharing similar attitudes, beliefs, and perspectives [6]. People typically show sympathy toward opinions that align with their social identity and group membership, while expressing aggression or hostility toward out-group members with contradictory views [7].
Current research has extensively documented the effects of negative sentiment, linking it to phenomena like emotional contagion [8], online flaming [9], and the manipulation of user behavior through algorithmic curation [10]. Studies have shown that social media algorithms can create echo chambers, amplify radical content, and exploit cognitive biases to capture user attention [11]. Furthermore, psychological factors such as the false consensus effect and pre-existing beliefs significantly shape how users perceive and engage with public opinion [12].
Social networks utilize algorithms that curate personalized feeds, reinforcing user preferences. This design leverages variable rewards and social validation (likes, shares) to encourage habitual checking and prolonged engagement [13]. Ultimately, this shapes user behavior by influencing what content users see, how they interact, and how much time they spend online. Platforms manipulate users’ actions through two primary mechanisms: systematic manipulation and engineered addiction. Manipulation is hidden influence that covertly subverts decision-making by exploiting cognitive and emotional vulnerabilities, which distinguishes it from overt persuasion or coercion. Platforms manipulate through deception (fake news and misleading advertising), temptation (perpetuating unrealistic beauty and success standards that create inauthentic desires), and incitement of inappropriate emotional responses, such as targeting low-income households with racially divisive content to elicit fear and loathing [14].
However, a critical gap remains between observing these negative sentiment dynamics and proactively mitigating them at their source. Existing approaches often focus on detecting negative content after it has already spread or on analyzing its sufficiency conditions [15]. What is lacking is a unified framework that identifies the fundamental antecedents of negative sentiment by modeling the expression capacity of users, that is, their inherent and context-dependent ability to communicate and influence others within a network. Expression capacity encompasses factors such as user temperament, the dynamic nature of event intensity, and the structural influence of key nodes, which are often overlooked.
The proposed ANSEC-MM model is evaluated against three state-of-the-art baselines:
EANN for event-based adversarial learning,
BERT for contextual language understanding, and
AOAN for aspect-oriented sentiment alignment.
This comprehensive comparison ensures robust validation across diverse sentiment analysis paradigms.
The existing baseline models EANN [16], BERT [17], and AOAN [18] have made significant strides in sentiment detection and propagation modeling, but they often overlook critical aspects of social network dynamics. For instance, EANN focuses on domain-invariant features but neglects the intrinsic expressive behavior of users. BERT focuses solely on textual content without modeling social influence patterns. Similarly, AOAN excels in aspect-based sentiment alignment but fails to model the dynamic role of influencer nodes, particularly the distinction between active and inactive influencers and their capacity for expression.
In response to this gap, this study proposes ANSEC-MM, a novel mixed-methods model for Identifying Antecedents of Negative Public Sentiment through Expression Capacity. The core contributions of this work are:
Introducing the Concept of Expression Capacity: The study moves beyond simple sentiment classification to model how a user’s ability and propensity to express themselves, shaped by personality, social pressure, and event relevance, serves as a key antecedent for negative sentiment propagation.
A Dynamic Influencer Node Framework: The model identifies and categorizes Influencer Nodes (both Active and Inactive) and introduces a Node Expressive Capacity (NE) metric to predict user interaction patterns that lead to negative sentiment.
A Proactive Mitigation Strategy: By integrating a Cognitive Effect Coefficient (φ) and Event Intensity (H), ANSEC-MM not only detects but also proactively mitigates the generation of negative sentiments through an algorithmic approach, distinguishing it from reactive state-of-the-art models.
The proposed model is validated against three benchmark datasets (ResearchGate, Zhihu, Sentiment140) and demonstrates superior performance compared to existing models like EANN, BERT, and AOAN.
In more detail, the key contributions of the study are as follows:
A Novel Modeling Framework: The proposed ANSEC-MM model introduces Node Expression Capacity (NE) and the Cognitive Effect Coefficient (φ). These concepts are used to quantify how factors such as user personality, event intensity, and social context contribute to the spread of negative sentiments.
A Dynamic Influencer Node Identification and Categorization Algorithm: A graph-theoretic algorithm has been developed that dynamically segments influencer nodes into Active and Inactive states over time. This allows for more precise and efficient identification of negative sentiment sources by focusing computational resources on nodes with high expressive potential, as defined by the NE metric.
A Proactive Mitigation Mechanism: A mitigation algorithm is designed and validated, leveraging the identified negative influencer nodes and their expression capacity to proactively intervene. By replacing negative content within a predicted range R′ and imposing temporary restrictions, the model demonstrably reduces the prevalence and spread of negative sentiments, as evidenced by our experimental results showing up to 88% mitigation efficacy.
Comprehensive Empirical Validation against State-of-the-Art Baselines: The study conducts a rigorous evaluation of the proposed ANSEC-MM framework, directly comparing it against three established models (EANN, BERT, and AOAN) across multiple datasets: Zhihu, Sentiment140, and ResearchGate. The empirical results demonstrate the superior performance of the proposed method, whose high accuracy rates validate the practical efficacy of the introduced methodological advances.
The remainder of this study is organized as follows. Section 2 reviews previous studies and identifies the research gap. Section 3 presents the model design and its working mechanism. Section 4 presents the performance comparison with other models, together with the results and a critical discussion of each outcome. Section 5 concludes the findings and outlines future research directions.
2. Literature Review
The rapid growth of social networks has transformed the way individuals communicate, share opinions, and engage with public discourse. However, alongside these benefits, the proliferation of negative sentiments, ranging from hostility and misinformation to cyberbullying and toxic speech, has emerged as a significant challenge. This study seeks to identify the underlying causes of such negative sentiments, highlighting factors such as anonymity, ideological polarization, algorithm-driven content amplification, and sociocultural tensions. The following studies are examined to identify the gaps and shortcomings relevant to this problem.
Sahebi et al. [19] provide a foundational theoretical analysis of how social media platforms undermine human autonomy through data control, attention capture, and behavioral manipulation. Their work synthesizes key themes like emotional contagion and algorithmic exploitation of psychological vulnerabilities, arguing compellingly for stronger ethical regulations. However, the study’s primary limitation is its reliance on theoretical and often outdated literature, lacking the empirical data and computational modeling needed to analyze real-time sentiment dynamics. This highlights a critical gap: while the impacts of platform design on autonomy are well-theorized, the field lacks operational models that can dynamically quantify and mitigate these effects, such as the node-level expression capacity and cognitive coefficients introduced in the ANSEC-MM model.
Neubaum et al. [20] investigated the psychological mechanisms underlying how users form perceptions of public opinion on social media, a process directly relevant to the formation of in-group and out-group dynamics that can fuel negativity. Through a two-session online experiment (N = 657), they demonstrated that a user’s pre-existing fear of social isolation can bias their attention to opinion cues, leading them to overgeneralize the sentiments expressed in a limited set of user comments to represent broader public consensus. This work is critical as it empirically links cognitive biases to the interpretation of social media content, showing how platform design can distort a user’s sense of the “opinion climate.” However, a significant limitation of this study is the small effect size of its findings, suggesting that the impact of isolated social media cues on opinion perception may be limited. Furthermore, the experimental design, which relied on a single exposure to static stimuli, cannot capture the long-term, cumulative effects of repeated algorithmic exposure to negative content in a dynamic feed. Therefore, while the authors ably identify a key micro-level cognitive mechanism, their approach does not scale to model the macro-level, dynamic propagation of sentiment through a network of influencer nodes over time, a gap that the proposed ANSEC-MM model addresses by explicitly modeling the temporal evolution of expression and influence.
Petit et al. [21] provide experimental evidence for the psychological mechanisms that trigger negative online behavior. Their research specifically demonstrates how pre-existing beliefs and the volume of negative comments act as key catalysts for verbal aggression (“flaming”). Through a controlled (2 × 2 × 2) experiment, they found that the interaction between a user’s stance on a contentious issue (e.g., gun control) and exposure to a high volume of negative comments significantly increases negative emotional valence, which in turn mediates flaming behavior. This study is crucial for the proposed model as it empirically validates that user-specific traits (pre-existing beliefs) and environmental cues (comment volume) are central to negative sentiment propagation. However, its limitations (a small sample size, N = 156, and a static, lab-based environment) highlight a critical gap in the literature. While such studies expertly identify what factors cause negativity, they lack the methodology to track these dynamics at scale and in real time within a live social network. The ANSEC-MM model addresses this by operationalizing similar concepts: the Cognitive Effect Coefficient (φ) can be seen as a computational analog to “pre-existing beliefs,” allowing the proposed model to infer and account for user temperament dynamically across a vast network, far beyond the scope of a controlled experiment.
Gorodnichenko et al. [22] examine how bots and sentiment affect human users, and in particular how information flows influence political decision-making, using the Brexit referendum and the 2016 U.S. presidential election as cases. Their analysis, which employed time-series modeling on curated Twitter data, revealed that bots are highly effective at amplifying political messages and triggering rapid information diffusion within 1–2 h of key events. A key finding was that bots can be as influential as humans in energizing voters and exacerbating divisions through sentiment-laden content. This study critically demonstrates how network dynamics and automated actors can be leveraged to manipulate perceived public opinion and sway political outcomes. However, its approach relies on a binary classification of accounts as “bot” or “human,” a limitation that fails to capture the more nuanced, dynamic role of influencer nodes based on their expressive capacity and impact, which is a central focus of the proposed ANSEC-MM model.
Xie et al. [23] examine the user’s internal psychological response to negative online environments using the Stimulus–Organism–Response (SOR) framework. Their study on Weibo users demonstrates that external stimuli such as advertising interference, rumor dissemination, and information equivocality act as significant stressors. These stimuli trigger internal states of social media fatigue and perceived information overload within the user (the “Organism”), which in turn increase the intention to discontinue platform use (the “Response”). A key strength of their work is the hybrid methodological approach, combining Structural Equation Modeling (SEM) to establish net effects and Fuzzy-Set Qualitative Comparative Analysis (fsQCA) to reveal multiple complex pathways leading to discontinuance. This underscores the non-linear, multi-factorial nature of user reactions to negative content. However, the study’s focus is limited to platform discontinuation as the ultimate outcome and is constrained by its specific context (Weibo) and sample, leaving a gap in understanding how these negative internal states manifest within the platform not as exit, but as the production and propagation of further negative sentiment. This is a crucial distinction, which the proposed model addresses by focusing on mitigating negative output rather than just user churn.
Piko et al. [24] contribute to the understanding of psychological drivers behind problematic social media use, identifying key personality correlates of addiction. Their study of Hungarian university students established that individuals with low self-esteem, a high fear of negative evaluation, and lower conscientiousness are significantly more prone to social media addiction, with the study noting a higher propensity among female participants. Based on these findings, the authors appropriately recommend targeted educational interventions to promote responsible usage. However, the methodological limitations of their work highlight a specific research gap. The use of a non-representative, convenience sample and a cross-sectional design limits the generalizability of the findings and prevents any inference of causality. Furthermore, the employment of a brief personality scale and a binary gender construct reflects a lack of methodological nuance that can limit the study’s applicability to diverse, modern populations. Consequently, while this research effectively underscores the importance of psychological variables, its design confines it to identifying correlations rather than developing a dynamic model for predicting or mitigating negative behavior in real-world social networks.
While prior research has explored sentiment analysis, misinformation propagation, and behavioral influences in social networks, a consistent gap remains in modeling the nuanced interplay between user expression capacity and influencer node dynamics. Existing approaches lack a holistic mechanism to categorize influencers based on activity status, incorporate cognitive coefficients, or leverage expression capacity for sentiment mitigation. The ANSEC-MM model is designed to fill this gap by introducing a mixed-methods framework that unifies graph theory, cognitive modeling, and algorithmic mitigation strategies.
3. Proposed Model
The research methodology centers on examining how negative public sentiment evolves in the social media landscape. For this model, a total of 250 significant public incidents that occurred between January 2024 and March 2025 were identified and collected. The sample deliberately includes a diverse range of events: those that generated constructive public discourse alongside incidents that sparked widespread concern and controversy. Through careful screening, cases that demonstrated clear patterns of negative public sentiment were isolated and organized chronologically. This systematic arrangement formed the foundation of the proposed analytical framework, creating a case database specifically tailored for investigating contemporary social opinion dynamics.
3.1. Formal Definition of Core Constructs
The following key variables are introduced for the development of the proposed model.
Node Expressive Capacity (NE): A discrete integer value between 0 and 10, representing the predicted maximum number of sentiment-carrying interactions a user node can initiate within a single propagation time step, t. It is a function of the user’s intrinsic traits and the prevailing social context.
Cognitive Effect Coefficient (φ): Computed from two inputs. The personal character parameter (€) is derived from the user’s historical posting frequency and engagement reciprocity, while the opinion difference is calculated using Jaccard similarity on topic-interaction vectors.
Collective Expression Parameter (η): A system-level hyperparameter, tuned on a per-dataset basis to reflect the overall activeness of the social network; for example, a lower η is used for forums like ResearchGate and a higher η for platforms like Twitter/Sentiment140.
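The Jaccard-based opinion difference mentioned in the definition of φ can be illustrated with a minimal sketch. It assumes that a topic-interaction vector is binary and can therefore be treated as the set of topics a user has interacted with, and that the opinion difference is taken as one minus the Jaccard similarity; both choices, along with the function names and topic labels, are illustrative assumptions and not the study's exact formulation.

```python
def jaccard_similarity(a: set, b: set) -> float:
    """Return |A ∩ B| / |A ∪ B| for two topic sets.

    Convention (an assumption): two empty sets are treated as identical,
    so their similarity is 1.0.
    """
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def opinion_difference(a: set, b: set) -> float:
    """Jaccard distance, used here as a stand-in for opinion difference."""
    return 1.0 - jaccard_similarity(a, b)


# Hypothetical topic-interaction sets for two users.
user_u = {"school_safety", "fire_incident", "accountability"}
user_v = {"fire_incident", "accountability", "regulation", "media"}

similarity = jaccard_similarity(user_u, user_v)   # 2 shared topics out of 5
difference = opinion_difference(user_u, user_v)
print(f"similarity={similarity:.2f}, difference={difference:.2f}")
```

Treating the vectors as sets keeps the sketch dependency-free; a weighted variant would be needed if the vectors carried interaction counts rather than presence flags.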
In response to the limitations of existing sentiment analysis models, such as their inability to dynamically classify influencer nodes, incorporate expression capacity, or model cognitive effects, this study introduces the ANSEC-MM framework. The model is designed to identify and mitigate negative sentiment through a novel three-phase approach centered on expression capacity. Unlike prior methods, ANSEC-MM explicitly differentiates between active and inactive influencer nodes, incorporates node expressive capacity (NE), and models cognitive and temporal factors to better capture and control sentiment propagation.
Building on expression capacity, the ANSEC-MM model introduces influencer node identification in social networks. The first stage identifies and targets the initial influencer nodes along with their activation status, as illustrated in Figure 1, and further divides them into Active and Inactive influencer nodes. The second stage triggers sentiment formation, which separates the negative from the positive influencer nodes; the model considers only the negative influencer nodes. The last stage models the evolving dynamics of negative influencer nodes as sentiment influence shifts over time, until negative sentiments are eventually brought under control.
Initially, suitable influencer nodes are identified within the comprehensive social events database, which is built using structured sampling from sources such as ResearchGate, Zhihu, and Sentiment140. These nodes serve as origin points from which sentiments diffuse progressively through successive network layers. Throughout this diffusion process, sentiments undergo continuous temporal transformation. To optimize target sentiment outcomes, iterative approaches are typically employed until the propagation process reaches equilibrium. The primary emphasis is on two critical elements: strategic identification of initial influencer nodes and accurate modeling of temporal sentiment dynamics. The framework operates under the assumption that individual influencer nodes possess limited capacity for sentiment expression. The operationalization of the multi-dimensional determinants is documented in Table 1 [25], which provides detailed specifications for each sentiment variable category, measurement approaches, and theoretical justifications. This sentiment variable architecture forms the empirical foundation for examining how conjunctural relationships between different factors drive varying intensities of public sentiment formation in the current hyperconnected information landscape.
To categorize negative and non-negative sentiment correctly, the study applied Necessary Condition Analysis (NCA) [26] to 12 potential determinants and identified which individual sentiment variables functioned as critical prerequisites for achieving high public sentiment intensity in digital environments. NCA is a methodological approach that evaluates necessity relationships by determining whether specific conditions must be present for an outcome to occur, complementing traditional sufficiency-focused analyses [27].
The NCA framework operates on the principle that necessary conditions create “bottlenecks” in causal pathways: if these conditions are absent or insufficient, the desired outcome cannot materialize regardless of other contributing factors [28]. This approach is valuable for understanding viral content dynamics, where certain threshold conditions may be essential for information to achieve widespread circulation across social media ecosystems. The proposed ANSEC-MM model evaluates each of the 12 variables across multiple necessity metrics, including consistency scores, coverage indices, and effect sizes, to determine their individual contribution to sentiment formation intensity. The results of this necessity assessment are documented in Table 2, which provides detailed consistency and coverage statistics for each examined sentiment variable.
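To make the consistency and coverage statistics concrete, the sketch below computes them for a single condition using the standard fuzzy-set necessity formulas: consistency is the sum of min(x, y) divided by the sum of y, and coverage is the sum of min(x, y) divided by the sum of x, where x is condition membership and y is outcome membership. This is a generic illustration with made-up membership scores; it is not necessarily the exact procedure or data behind Table 2, and it does not compute NCA effect sizes.

```python
def necessity_consistency(condition, outcome):
    """Degree to which the outcome is a subset of the condition."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    total_outcome = sum(outcome)
    return overlap / total_outcome if total_outcome else 0.0


def necessity_coverage(condition, outcome):
    """How much of the condition is 'used up' by the outcome (relevance)."""
    overlap = sum(min(x, y) for x, y in zip(condition, outcome))
    total_condition = sum(condition)
    return overlap / total_condition if total_condition else 0.0


# Hypothetical calibrated membership scores for five cases.
condition = [0.9, 0.8, 0.6, 0.3, 0.7]   # e.g. membership in one determinant
outcome = [0.8, 0.7, 0.6, 0.2, 0.9]     # membership in "high sentiment intensity"

consistency = necessity_consistency(condition, outcome)
coverage = necessity_coverage(condition, outcome)
print(f"consistency={consistency:.3f}, coverage={coverage:.3f}")
```

A condition is conventionally treated as a candidate necessary condition only when its consistency is high; the specific cut-off is a modeling choice and is not fixed by this sketch.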
Through systematic calibration of fuzzy-set membership scores and rigorous evaluation of solution coverage metrics, consistency thresholds, and unique coverage parameters, this analysis uncovered three distinct configurational archetypes that characterize viral sentiment formation processes. Contemporary fsQCA methodology [23] emphasizes the importance of examining both intermediate and parsimonious solutions to ensure robust causal inference, while also considering necessity-in-kind relationships that may exist within sufficient configurations.
The identified configurations represent empirically grounded typologies of how different combinations of causal conditions synergistically produce high-intensity opinion diffusion in current fragmented social network landscape. Each configuration illuminates a distinct causal pathway where specific conjunctions of factors such as content characteristics, network structures, temporal dynamics, and platform features interact to create viral cascades through different underlying mechanisms.
These three prototypical patterns, detailed in Figure 2 with their respective consistency scores, raw coverage, and unique coverage statistics, demonstrate how conjunctural causality operates in digital sentiment ecosystems. The configurations reveal that viral diffusion emerges through multiple alternative pathways rather than a single universal model, reflecting the heterogeneous nature of information propagation across different event types, audience segments, and platform contexts in contemporary social media environments.
The study includes a school fire incident as an illustrative case of viral diffusion dynamics. The case selection is grounded in two criteria that support analytical rigor and theoretical generalizability. First, the selected case aligns closely with the three critical causal conditions, namely accountability attribution ambiguity, cross-platform media amplification, and high-engagement influencer activation. This convergence ensures that the case captures representative patterns of digital sentiment formation while providing sufficient empirical depth for mechanism-level analysis. Second, cases involving complex institutional actors were deliberately excluded, which isolates the dynamics of public-interest-driven viral diffusion. This removes confounding variables related to organizational crisis communication strategies and allows a cleaner examination of grassroots sentiment formation mechanisms.
The event generated complex attribution dynamics involving school administration, regulatory oversight bodies, student communities, and parent networks. This multi-actor responsibility landscape created fertile conditions for sustained online discourse, as different stakeholder groups advanced competing narratives about causality and accountability. The case demonstrates a bifurcated influencer response pattern characteristic of modern viral events, known as Dual-Track Influencer Ecosystem Activation. This pattern creates a feedback loop in which analytical content provides legitimacy for emotional responses, while grassroots amplification sustains the attention needed for viral persistence across multiple social media platforms [29].
3.2. Dynamic Segmentation of Viral Event Lifecycles
To systematically analyze the temporal dynamics of viral information propagation and map the evolutionary pathways of public engagement intensity, the model employs a data-driven approach to segment high-impact social events into distinct developmental phases, as illustrated in Figure 3. The analytical framework uses Baidu Search Index analytics [30] as the primary temporal indicator, supplemented by cross-platform engagement velocity metrics to capture the full digital attention lifecycle.
3.3. Viral Event Lifecycle
Consider the example of the school fire incident, whose digital trajectory began in January 2025. According to Baidu Search Index analytics, the incident escalated to peak public engagement within 24 h during the first week of January. This demonstrates the characteristic exponential growth pattern of contemporary viral events, in which algorithmic amplification mechanisms and cross-platform content syndication create compressed attention cycles that concentrate massive public focus into narrow temporal windows. It also reflects how modern digital ecosystems can transform local incidents into national discourse phenomena through accelerated information cascades. Following peak attention intensity, the incident entered a complex secondary propagation phase. This phase was characterized by a moderate decline in primary engagement metrics, while discussion volumes remained elevated through derivative content creation, fact-checking activities, and misinformation networks that exploited the information vacuum and emotional intensity surrounding the tragedy. It illustrates how contemporary viral events generate sustained engagement through multi-layered discourse ecosystems, where original content spawns countless interpretations, analyses, and manipulated narratives that extend the event’s digital lifespan beyond initial news cycles.
The final temporal segment witnessed progressive normalization of discussion intensity as public attention migrated toward newer events, yet the incident maintained persistent baseline engagement through archival references, commemorative posts, and periodic resurgence triggered by related safety incidents. Across all of these phases, the proposed ANSEC-MM model uses deep bidirectional contextualization to capture nuanced sentiment patterns within social network discourse. It is particularly effective at detecting subtle emotional shifts and implicit negative sentiment expressions that traditional lexicon-based approaches often miss, since users in contemporary digital communication employ sophisticated linguistic strategies, including sarcasm, irony, and coded language, to express criticism and dissatisfaction. The bidirectional attention mechanism of the model processes contextual information from both preceding and succeeding text segments simultaneously, which supports the interpretation of sentiment-bearing constructions that depend on broader conversational context rather than isolated keyword analysis.
The core computational framework of the model consists of multi-layered transformer encoder blocks, each incorporating self-attention mechanisms that dynamically weight the importance of different textual elements, feed-forward neural networks that process contextualized representations, and layer normalization components that stabilize training dynamics across deep network architectures. This sophisticated architecture enables the model to capture complex linguistic phenomena including negation handling, sentiment modifier effects, and domain-specific expression patterns that are particularly crucial for analyzing public sentiment evolution in crisis communication contexts where emotional intensity and linguistic creativity often exceed standard sentiment analysis challenges.
3.4. Model Training
The proposed ANSEC-MM model undergoes specialized fine-tuning for sentiment analysis, which involves integrating task-specific classification layers and training on carefully curated sentiment baseline datasets such as Zhihu, Sentiment140, and ResearchGate; the latter is treated both as a social research forum and as a research dataset. These online social forums capture the unique linguistic characteristics of online public discourse during crisis events. This adaptive training approach targets the detection of negative sentiment evolution patterns, emotional escalation trajectories, and sentiment polarity shifts that characterize public sentiment dynamics during high-impact social incidents, enabling the tracking of how collective emotional responses develop and transform across different temporal phases of viral event lifecycles.
To identify negative sentiments in the social network, the model treats the selection of initial influencer nodes as a graph-theoretic problem focused on identifying optimal initial influencer nodes and their corresponding activation strategies. The social network is represented as $G = (V, E, W)$, where $V$ is the set of influencer nodes. Further, $E$ represents the set of directed edges linking nodes, indicating relationships between users. $W$ represents edge weights indicating the degree of user proximity, with each weight $w_{ij}$ ranging from −1 to 1, signifying that relationships progressively transition from antagonistic to intimate.
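The graph representation described above, a directed network whose edge weights lie between −1 (antagonistic) and 1 (intimate), can be sketched as a small data structure. The class name `SocialGraph`, its methods, and the user identifiers are hypothetical and serve only to illustrate the weight constraint.

```python
from dataclasses import dataclass, field


@dataclass
class SocialGraph:
    """Directed graph with signed proximity weights in [-1, 1]."""
    nodes: set = field(default_factory=set)
    weights: dict = field(default_factory=dict)  # (source, target) -> weight

    def add_edge(self, source, target, weight):
        # Enforce the stated range: -1 is antagonistic, 1 is intimate.
        if not -1.0 <= weight <= 1.0:
            raise ValueError(f"weight {weight} is outside [-1, 1]")
        self.nodes.update((source, target))
        self.weights[(source, target)] = weight

    def out_neighbors(self, node):
        """Return {target: weight} for edges leaving `node`."""
        return {t: w for (s, t), w in self.weights.items() if s == node}


graph = SocialGraph()
graph.add_edge("u1", "u2", 0.8)    # close relationship
graph.add_edge("u2", "u1", -0.4)   # direction matters: u2 is hostile to u1
graph.add_edge("u1", "u3", 0.1)
print(graph.out_neighbors("u1"))
```

Storing weights per ordered pair reflects that the edges are directed, so the proximity from one user to another need not equal the proximity in the reverse direction.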
3.5. Parameter Estimation and Model Validation
Data-Driven Parameter Estimation: Parameters such as the attenuation coefficient (α) in Equation (5) are not arbitrary; they are estimated from empirical decay curves of real-world event lifespans on social media.
Sensitivity Analysis: A sensitivity analysis is reported for key parameters such as η and α to demonstrate how they influence model output and to establish the robustness of the findings.
Baseline Comparison: The output of the proposed model is validated against the ground-truth labels in datasets such as Zhihu, Sentiment140, and ResearchGate to calculate the accuracy metrics reported in the study.
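One way the attenuation coefficient could be estimated from a decay curve is a log-linear least-squares fit. The sketch below assumes the intensity follows H(t) = e^{−αt} (the form used in Algorithm 1) and that intensities are normalized to (0, 1]; the fitting procedure, the function name `estimate_alpha`, and the synthetic curve are assumptions, since the source does not specify how the estimate is obtained.

```python
import math


def estimate_alpha(times, intensities):
    """Least-squares fit of ln H(t) = -alpha * t through the origin."""
    num = sum(t * math.log(h) for t, h in zip(times, intensities))
    den = sum(t * t for t in times)
    return -num / den


# Synthetic decay curve generated with alpha = 0.3 (illustrative, not real data)
times = [1, 2, 3, 4, 5]
curve = [math.exp(-0.3 * t) for t in times]
alpha_hat = estimate_alpha(times, curve)
```

On noisy empirical curves the same fit returns the α that best matches the observed decay in the log domain, which is a convenient starting point for the sensitivity analysis.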
3.6. Working with Expression Capacity
Expression Capacity describes the ability of a user to interact with their neighbors. The proposal of this concept takes into account the following factors:
Every individual has personality traits that influence their behavior. For example, introverts may be reluctant to express their sentiments, which may lead them not to react. People may choose to remain inactive due to social pressure from authority or conformity.
The spread of sentiments has finite duration; it eventually encounters a cessation threshold. This happens as the relevance of an event slowly fades, causing diminishing user interest over temporal periods.
Inactive nodes can influence the choice of initial nodes and the diffusion of information.
For the proposed model, the social network societies are divided into active and inactive societies. An inactive society denotes a community where interpersonal communication is comparatively limited, and users may opt to withhold their viewpoints due to societal constraints or individual personality characteristics. In contrast, an active society describes one characterized by frequent individual communication, where the public demonstrates greater willingness to share diverse viewpoints and information.
Varying societal structures can influence the propagation of target sentiments. For instance, in an inactive society, the percentage of individuals voicing strongly favorable or unfavorable viewpoints may remain comparatively minimal. Simultaneously, the population may demonstrate heightened vulnerability to conformist pressures, resulting in uniform sentiment patterns. Therefore, a variable € has been introduced that describes the impact of this social environment.
Further, a node expressive capacity (NE) is introduced in Equation (1). NE takes integer values from 0 to 10 and predicts the number of times a user can interact with surrounding influencer nodes in the current environment. The cognitive effect coefficient, φ, given in Equation (2), is determined by the individual’s temperament and upbringing environment. Event intensity, H, represents the level of attention an incident generates, while the collective expression parameter, η, is formulated to account for the mentality during user communication. Overall, Equation (1) represents the anticipated interaction rate of user vertices with neighboring vertices under the impact of these variables, enabling the identification of inactive influencer nodes.
€ is the personal character, which describes the intrinsic cognitive characteristic of a user. It takes values € ∈ [0, 1], where 0 reflects a user who is maximally private and 1 reflects a user who is maximally social. Randomly assigned values between 0 and 1 are employed to simulate diverse personality traits. Similarly, j denotes an upstream neighbor of vertex i, the viewpoint gap between vertices i and j is measured from their opinions o_i and o_j, and k represents the size of the upstream neighborhood for vertex i.
Newton’s law of cooling is used to model the event intensity H(t) in Equation (3), which captures users’ decreasing attention as an event loses relevance:

$$\frac{dH(t)}{dt} = -\alpha \left( H(t) - H_c \right) \tag{3}$$

By analogy with the temperature of an object at time t, H(t) denotes the intensity of the event at time t, and the rate of change in intensity is dH(t)/dt. Here, H_c signifies the fixed environmental temperature, indicating the intensity level toward which the event settles, and α serves as the attenuation coefficient, measuring the velocity of intensity decline. Solving Equation (3) yields Equation (4), where C indicates the constant term obtained via the differential equation solution methodology:

$$H(t) = H_c + C e^{-\alpha t} \tag{4}$$

When t → ∞, the heat intensity of the event drops to zero, that is, H(∞) = 0. At the initial time t = 0, the heat intensity of the event reaches 1, specifically, H(0) = 1. Applying these boundary conditions to Equation (4) results in C = 1 and H_c = 0, leading to the definitive equation for the intensity of the event given in Equation (5):

$$H(t) = e^{-\alpha t} \tag{5}$$
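The derivation above can be checked numerically. The sketch below assumes Equation (3) has the standard Newton's-law-of-cooling form dH/dt = −α(H − H_c), integrates it with a simple Euler scheme from H(0) = 1 with H_c = 0, and compares the result with the closed form e^{−αt} used in Algorithm 1. The function names, step count, and the value α = 0.3 are illustrative assumptions.

```python
import math


def event_intensity(t, alpha):
    """Closed-form event intensity H(t) = exp(-alpha * t)."""
    return math.exp(-alpha * t)


def euler_intensity(t_end, alpha, h0=1.0, h_c=0.0, steps=100_000):
    """Integrate dH/dt = -alpha * (H - h_c) with forward Euler steps."""
    dt = t_end / steps
    h = h0
    for _ in range(steps):
        h += -alpha * (h - h_c) * dt
    return h


closed_form = event_intensity(5.0, alpha=0.3)
numeric = euler_intensity(5.0, alpha=0.3)
```

The two values agree to within the discretization error of the Euler scheme, which supports the boundary-condition argument that the constant is 1 and the environmental term vanishes.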
The collective expression parameter η is employed to quantify the maximum communicative potential of most individuals within a particular social context. η oscillates around a central value of 10 and enables the modeling of interactive and dormant communities. η tends to be greater in an active society, demonstrating increased opinion and perspective sharing among members, whereas in a reserved community, η tends to be smaller, demonstrating limited dialog and viewpoint expression.
3.7. Selection of Influencer Nodes
In the first stage of influencer node selection, both the sentiment inclination and the influence of the nodes are considered to maximize desired outcomes. Afterward, two key characteristics of sentiment propagation were identified:
- i. Sentiments at first expand exponentially.
- ii. Expansion of active influencer nodes ceases following 5–10 propagation cycles, leading to a more scattered distribution of inactive influencer nodes.
It is concluded that mass selection of all influencer nodes, or a homogeneous selection strategy, is inadequate. Therefore, the proposed model incorporates an algorithm that employs a multi-stage influencer node methodology with varied tactics to amplify the dissemination of target sentiments. It executes sequentially, allocating influencer nodes across several phases to avoid local optima. This fosters adaptability and strategic versatility in desired sentiment propagation. The algorithm begins by segmenting the directed graph of a social network into distinct communities. It deploys k influencer nodes across l rounds and plants k/l influencer nodes within each round. To isolate key communities for streamlined calculation, it retains only communities with influencer node populations above 50. In the opening phase, influencer nodes are distributed in proportion to community size, and Equation (6) is used to determine the desired outcomes. When the propagation process reaches its second round, the influencer nodes are calculated using Equation (7). In both equations, o_i and o_j denote the sentiments of influencer nodes i and j, respectively; j is an active influencer node among the in-degree nodes of influencer node i, and w_ji indicates the weight of the directed link from j to i. During the initial stage of the dissemination process, the significance of influential nodes is emphasized by employing Equation (6) to calculate their desired returns. As the propagation process advances, inactive influencer nodes become more scattered. The complete process is described in Algorithm 1 and illustrated phase by phase in Figure 4.
| Algorithm 1. Dynamic Influencer Node Identification and Categorization |
| 1: | Procedure: Network G = (V, E, W), the number of influencer nodes, k, the number of sowing stages, l, and stage time, T, collective expression parameter η, attenuation coefficient α, active status threshold θ_active |
| 2: | Initialize: S ← ∅, influencers_per_stage ← k/l, time ← 0 For each node i ∈ V: Initialize opinion o_i, set event intensity H_i ← 1.0 |
| 3: | Detect Communities: communities ← LOUVAIN(G) |
| 4: | For stage r = 1 to l do: |
| 5: | For each node i ∈ V do: H_i ← e^{−α × time} from Equation (5); take the set of in-neighbors of node i in the directed graph G; calculate the Cognitive Effect Coefficient φ_i from €_i and these in-neighbors using Equation (2) |
| 6: | For each node i ∈ V do: If NE_i ≥ θ_active then Node_Status[i] ← Active Else Node_Status[i] ← Inactive |
| 7: | If r == 1 then: stage_influencers ← ∅ For each community C ∈ communities do: allocation ← influencers_per_stage × (|C|/|V|) |
| 8: | Else: inactive_candidates ← {i : i ∈ V, Node_Status[i] = Inactive, i ∉ S}; sort inactive_candidates by P2(i); select the top influencers_per_stage nodes from inactive_candidates and add them to S |
| 9: | Run sentiment propagation from nodes in S for T time steps |
| 10: | time ← time + T |
| 11: | Return S, Node_Status |
| 12: | end Procedure |
Phase 1 encompasses partitioning the network into communities and determining the influencer nodes for the initial round using Equation (6). Phase 2 shows that when the diffusion process enters the second round and influencer nodes are computed using Equation (7), active nodes already exist within the social network, including nodes with both positive and negative sentiments. Phase 3 shows the diffusion process continuing iteratively, with influencer nodes introduced at later stages until the influencer node allocation is depleted. During information exchange, sentiments may change dynamically. This framework is intended to model sentiment evolution in social networks more accurately by considering both personal convictions and selective engagement driven by sentiment compatibility.
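The first-round allocation step of Algorithm 1, allocation ← influencers_per_stage × (|C|/|V|), can be sketched as follows. The largest-remainder rounding used to turn fractional allocations into whole nodes is an assumption added here, as are the function name and the example community sizes; the sketch also takes the total as the sum of the listed community sizes.

```python
def allocate_per_community(community_sizes, influencers_per_stage):
    """Split a stage's influencer budget across communities by relative size."""
    total = sum(community_sizes.values())
    exact = {c: influencers_per_stage * n / total
             for c, n in community_sizes.items()}
    # Start from the integer part of each proportional share
    alloc = {c: int(x) for c, x in exact.items()}
    leftover = influencers_per_stage - sum(alloc.values())
    # Assumption: remaining seats go to the largest fractional parts
    by_remainder = sorted(exact, key=lambda c: exact[c] - alloc[c], reverse=True)
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc


# Hypothetical community sizes after Louvain partitioning
sizes = {"A": 600, "B": 300, "C": 100}
allocation = allocate_per_community(sizes, influencers_per_stage=10)
```

Whatever rounding rule is used, the allocations should sum to the per-stage budget k/l so that the total number of influencer nodes stays at k across all l rounds.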
3.8. Mitigation of Negative Sentiments
When a negative comment or opinion n is posted by one person about another, it is checked against the pre-defined range of comments R. If it matches, it is immediately replaced with a neutral comment ¢, and the person who posted the negative comment is restricted from posting further comments for a specific time period. If another person retaliates with negative comments, the same process is applied. Suppose the negative comment is generated at a given time with a given probability of being negative. Using the influencer nodes from the graph-theoretic formulation in the previous section, Algorithm 2 arranges the negative influencer nodes S_negative, with range R′, on the network G as defined in Equation (8), in order to estimate the range of negative sentiments.
The probability of generating a negative influencer is lower than initially expected during range estimation; therefore, the proposed model can handle the frequency of negative sentiments.
| Algorithm 2. Proactive Sentiment Mitigation Via Nudge-and-Boost |
| | Procedure: |
| 1: | For each negative influencer node s ∈ S_negative do: a. Identify Amplification Targets: Within the mitigation range R′, identify Active nodes with high centrality that are connected to s. These are the primary targets for content promotion. |
| 2: | Boost Constructive Content: a. Select Content: For the topic associated with the negative sentiment from s, select a relevant, constructive post C from the Library_Constructive_Content. b. Amplify Reach: Algorithmically increase the visibility of post C in the feeds of the Amplification Targets identified in Step 1. This can be simulated by increasing its weight in the feed ranking algorithm. |
| 3: | Nudge Repeat Offenders: a. Monitor: For each user u in R′, maintain a count negative_count[u] of their posts classified as negative. b. Check and Intervene: If negative_count[u] >= Nudge_Threshold (N_t) then: * Trigger Educational Nudge: Temporarily restrict u’s ability to post until they interact with a short, educational module on constructive communication and digital literacy. * Reset: negative_count[u] ← 0 // Reset counter after nudge. |
| 4: | Log and Update: Record all boosted content and nudged users. Update the network state for the next monitoring cycle. |
| 5: | Return Action_Log. |
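Step 3 of Algorithm 2 (counting negative posts per user, nudging at the threshold, and resetting the counter) can be sketched as below. The class name `NudgeMonitor`, the boolean classifier output, and the threshold value of 3 are illustrative assumptions; the content-boosting steps are not modeled.

```python
from collections import defaultdict


class NudgeMonitor:
    """Tracks negative posts per user and triggers a nudge at a threshold."""

    def __init__(self, nudge_threshold):
        self.nudge_threshold = nudge_threshold
        self.negative_count = defaultdict(int)
        self.action_log = []

    def record_post(self, user, is_negative):
        """Return True when the user should receive an educational nudge."""
        if not is_negative:
            return False
        self.negative_count[user] += 1
        if self.negative_count[user] >= self.nudge_threshold:
            self.action_log.append(("nudge", user))
            self.negative_count[user] = 0  # reset counter after nudge
            return True
        return False


# Hypothetical user posting four negative comments in a row
monitor = NudgeMonitor(nudge_threshold=3)
results = [monitor.record_post("u1", True) for _ in range(4)]
```

In this run the third negative post triggers the nudge and the fourth starts a fresh count, mirroring the reset in the algorithm.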
3.9. Experimental Setup and Data Preprocessing
To ensure a fair and reproducible evaluation across the multilingual datasets (Zhihu in Chinese, Sentiment140 in English, and ResearchGate, which is primarily in English), a consistent and language-appropriate preprocessing pipeline was employed, as described in the following subsections.
3.9.1. Model Architecture and Cross-Lingual Handling
Given the multilingual nature of the data, the proposed ANSEC-MM model utilized XLM-RoBERTa (XLM-R) [31] as its pretrained textual backbone. XLM-R is a transformer-based model specifically designed for cross-lingual understanding, having been trained on 100 different languages. This allowed a single, unified model to process both English and Chinese text without separate language-specific models, so that performance differences are attributable to the proposed algorithm rather than to inconsistencies in the underlying text encoder.
3.9.2. Language-Specific Preprocessing
The following preprocessing steps were applied to each dataset:
For Sentiment140 and ResearchGate (English): Text was lowercased, and URLs, user mentions (@user), and non-alphanumeric characters were removed. Tokenization was performed using the XLM-R tokenizer’s built-in subword tokenizer, which is based on SentencePiece. Emojis were converted to their textual descriptions (for example, the slightly smiling face emoji becomes “:slightly_smiling_face:”) using the demoji Python library to preserve sentiment information.
For Zhihu (Chinese): A critical step for Chinese NLP is word segmentation. Therefore, the Jieba library [32] was used to perform word segmentation on the raw text. The segmented text was then tokenized using the same XLM-R tokenizer. Note that the SentencePiece-based XLM-R tokenizer does not itself require pre-segmented input; the Jieba segmentation supplies explicit word boundaries beforehand. URLs and non-essential punctuation were removed. Emojis were handled identically to the English datasets.
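The English cleaning steps listed above (lowercasing, then removing URLs, user mentions, and non-alphanumeric characters) can be sketched with the standard library alone. The regular expressions and the function name `clean_english` are assumptions; emoji conversion, Jieba segmentation, and XLM-R tokenization are deliberately left out, and in a full pipeline emoji conversion would need to run before the non-alphanumeric filter.

```python
import re

# Illustrative patterns; the exact rules used in the study are not specified
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")


def clean_english(text):
    """Lowercase, strip URLs and @mentions, drop non-alphanumerics."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALNUM_RE.sub(" ", text)
    # Collapse the whitespace left behind by the substitutions
    return " ".join(text.split())


cleaned = clean_english("@Bob This is NOT good!! see https://example.com/x")
```

URLs are removed before the character filter so that fragments such as "https" and "com" do not survive as stray tokens.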
3.9.3. General-Specific Preprocessing
These preprocessing steps were applied to all datasets.
No language-specific stopword removal was applied, as modern transformer models like XLM-R can often infer the importance of words from context. Furthermore, stopwords can carry sentiment in certain contexts, as in “This is not good.”
All texts were truncated or padded to a maximum sequence length of 128 tokens to create uniform input dimensions for the model.
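The fixed-length step can be illustrated on a plain list of token IDs. This is a simplified sketch: the function name is hypothetical, `pad_id=1` is a placeholder default that should be read from the actual tokenizer, and a real tokenizer would also keep its special start and end tokens when truncating.

```python
def pad_or_truncate(token_ids, max_len=128, pad_id=1):
    """Clip or right-pad a token ID list to exactly max_len entries."""
    clipped = token_ids[:max_len]
    n_pad = max_len - len(clipped)
    # Attention mask marks real tokens with 1 and padding with 0
    attention_mask = [1] * len(clipped) + [0] * n_pad
    return clipped + [pad_id] * n_pad, attention_mask


# Hypothetical short and long inputs
short_ids, short_mask = pad_or_truncate([5, 6, 7])
long_ids, long_mask = pad_or_truncate(list(range(200)))
```

Returning the attention mask alongside the IDs lets the encoder ignore padded positions, so padding does not alter the representation of short posts.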
This standardized, yet language-aware, preprocessing pipeline ensures that the textual data from all sources was in a compatible format for the cross-lingual XLM-R model, thereby guaranteeing the comparability and reproducibility of the reported results.
4. Performance and Results
The performance of the ANSEC-MM model is evaluated against three state-of-the-art baselines: EANN, BERT, and AOAN. BERT provides strong textual understanding capabilities for sentiment classification, while EANN and AOAN offer targeted solutions for specific sentiment analysis challenges. Unlike these models, ANSEC-MM incorporates expression capacity and influencer node dynamics, which is intended to enable more precise detection and mitigation of negative sentiments. The following experiments examine how these components contribute to accuracy and robustness across multiple datasets.
4.1. Evaluation Protocol and Metrics
Sentiment Detection Accuracy was evaluated as a standard classification task. The model was tasked with classifying user posts into negative or non-negative categories. The ground truth was provided by human-annotated labels on a held-out test set comprising 100 posts from each dataset. Detection Accuracy is the percentage of correctly classified instances against this human-annotated ground truth.
Evaluation Protocol: To evaluate the performance of the proposed model, a five-fold cross-validation strategy was adopted. The data from each dataset was randomly shuffled and split into five folds. In each iteration, four folds were used for training and the remaining fold for testing, and the trained model was applied to the test fold to obtain performance metrics. This process was repeated five times, with each fold used exactly once as the test set, and the reported results are the average across all folds.
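The fold construction described above can be sketched with the standard library. The function name, the fixed seed, and the ten-sample example are assumptions for illustration; model training and metric averaging are omitted.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(10, k=5))
```

Each sample lands in exactly one test fold, so averaging the five fold scores uses every labeled post once for evaluation.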
Mitigation Accuracy is a more complex metric designed to evaluate the efficacy of the proposed intervention mechanism. It is not a classification accuracy but a performance metric for the mitigation algorithm.
Its objective is to measure how successfully the model prevents the propagation of negative sentiments. The sentiment propagation process is simulated twice:
Baseline Simulation: The propagation is run without the mitigation algorithm, and the final number of negative nodes is recorded as N_baseline.
Mitigation Simulation: The propagation is run again with the mitigation algorithm active, and the final number of negative nodes is recorded as N_mitigated. Mitigation Accuracy is the percentage reduction in the spread of negative sentiments:

$$\text{Mitigation Accuracy} = \frac{N_{\text{baseline}} - N_{\text{mitigated}}}{N_{\text{baseline}}} \times 100\%$$

This metric quantifies how effectively the proposed intervention contains the negative sentiment cascade. A higher percentage indicates more successful mitigation, with 100% representing complete containment of the spread.
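As a worked example, the metric can be computed directly from the two simulation outcomes, assuming it is the percentage reduction in final negative nodes relative to the baseline run. The function name and the node counts below are hypothetical and are not results from the study.

```python
def mitigation_accuracy(n_baseline, n_mitigated):
    """Percentage reduction in final negative nodes due to mitigation."""
    if n_baseline <= 0:
        raise ValueError("baseline run must end with at least one negative node")
    return 100.0 * (n_baseline - n_mitigated) / n_baseline


# Hypothetical counts: 120 negative nodes without mitigation, 30 with it
score = mitigation_accuracy(n_baseline=120, n_mitigated=30)
```

A mitigated run that ends with no negative nodes gives 100%, matching the complete-containment case described above.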
The results reported in Section 4.2 and subsequent sections represent the average performance across all test runs. A description of the datasets used in the experiments, together with preprocessing statistics, is provided in Table 3.
4.2. Impact of Expression Capacity on Performance Comparison of Influencer Nodes
The performance of influencer nodes in identifying negative sentiments was tested and analyzed with and without Expression Capacity on three datasets: Zhihu, Sentiment140, and ResearchGate. Each experiment was applied to every dataset individually, and the performance of each model was analyzed.
4.2.1. ResearchGate with and Without Expression Capacity
The analysis of ResearchGate data reveals distinct performance patterns under different conditions. As shown in Figure 5a, with expression capacity, ANSEC-MM achieved the highest performance (0.93), followed closely by BERT (0.90), demonstrating the value of specialized design and advanced language modeling. Without expression capacity (Figure 5b), all models regressed, but BERT (0.83) showed the greatest resilience, marginally exceeding ANSEC-MM (0.82). This highlights the robustness of BERT and the significant performance cost of removing expressive features, with EANN and AOAN consistently trailing in both scenarios.
Similarly, with expression capacity, ANSEC-MM identified the most influencer nodes (110), while BERT also performed strongly (102 nodes), as depicted in Figure 6a. Without expression capacity (Figure 6b), a significant reduction occurred across all models; BERT (87 nodes) and ANSEC-MM (82 nodes) were most affected in absolute terms, though BERT maintained a strong lead over EANN. The results indicate that expression features are critical for maximizing influencer node discovery, with ANSEC-MM benefiting most from their inclusion.
4.2.2. Zhihu with and Without Expression Capacity
In the scaled Zhihu analysis, ANSEC-MM achieved the maximum of 120 influencer nodes with expression capacity, while BERT reached 110 nodes, as illustrated in Figure 7a. Without expression capacity, they dropped to 73 and 75 nodes, respectively, as shown in Figure 7b. BERT consistently outperformed EANN by 41–67% across conditions. The scaled results maintain the same performance hierarchy as the original data, confirming ANSEC-MM’s superior influencer identification and BERT’s strong competitive performance on the Zhihu platform, with expression capacity remaining crucial for optimal results.
Similarly, the scaled Zhihu data in Figure 8a show exceptional growth for ANSEC-MM, which increased 208% from 39 to 120 nodes over five weeks with expression capacity. BERT demonstrated strong growth from 45 to 110 nodes (a 144% increase). Without expression capacity, growth rates slowed significantly across all models, as shown in Figure 8b. ANSEC-MM maintained its leadership position, while BERT showed more consistent cross-condition performance. The scaled trends confirm the original patterns, emphasizing the critical role of expression capacity in accelerating influencer discovery on knowledge-sharing platforms.
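The growth percentages quoted above follow from the start and end node counts given in the text. A short check, using a hypothetical helper name, is:

```python
def percent_increase(start, end):
    """Relative growth from start to end, in percent."""
    return 100.0 * (end - start) / start


# Node counts as reported in the text for Zhihu with expression capacity
ansec_growth = percent_increase(39, 120)   # about 207.7, reported as 208%
bert_growth = percent_increase(45, 110)    # about 144.4, reported as 144%
```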
4.2.3. Sentiment140 with and Without Expression Capacity
On Sentiment140, ANSEC-MM achieved the highest performance with expression capacity, identifying 115 influencer nodes against 110 for BERT, as illustrated in Figure 9a. Without expression capacity, BERT held its position with 82 nodes versus 78 for ANSEC-MM, as shown in Figure 9b. BERT demonstrated a 28–44% improvement over EANN across conditions. The sentiment analysis dataset highlights the natural advantage of BERT in processing emotional content, allowing it to perform slightly better at some points than the specialized ANSEC-MM model without expression capacity for influencer identification tasks.
Similarly, the Sentiment140 data revealed consistent leadership of ANSEC-MM with expression capacity, growing from 38 to 115 nodes, as shown in Figure 10a, while BERT showed strong growth from 32 to 110 nodes. Without expression capacity, growth rates decreased significantly, but BERT maintained its advantage, as depicted in Figure 10b. In that setting, the sentiment-rich environment of Sentiment140 plays to the strengths of BERT in natural language understanding, enabling it to outperform ANSEC-MM. This contrasts with the other datasets, where ANSEC-MM typically leads, and highlights dataset-specific variation in model performance.
4.3. Negative Sentiments Detection
Negative sentiments were identified by the proposed ANSEC-MM model under different numbers of influencer nodes and compared with the EANN, BERT, and AOAN models, as illustrated in Figure 11. The proposed model showed a significant advantage across all datasets in capturing how negative sentiment arises when a conversation begins with the participating influencer nodes taking priority. At 55 influencer nodes, the proposed model detects blame-game opinions much better than EANN. Similarly, its detection improved to 50% at 105 influencer nodes and reached up to 90% at 255 influencer nodes, as defined by the evaluation protocol in Section 4.1.
The proposed model faced a hindrance at 155 influencer nodes, where EANN, BERT, and AOAN appeared closely balanced, although the baselines soon dispersed afterwards. This resulted from the fact that BERT and AOAN incorporate both individual opinions and node influence when choosing influencer nodes, yet they do not completely account for the sentiments of the neighboring nodes. Conversely, while EANN takes into account the potential impact of neighboring nodes, it overlooks the influence of the nodes themselves. This could result in choosing influencer nodes that hold the desired opinions but possess limited influence, thus impeding the rapid dissemination of desired opinions. By contrast, the proposed model adopts a comprehensive strategy that considers node sentiments, their influence levels, and neighboring node sentiments during influencer selection. Simultaneously, it distributes influencer nodes according to community size, supporting swift detection of negative sentiments to create a collective effect and achieve social consensus. Therefore, the proposed model can detect negative sentiment more accurately and demonstrates stable performance across different social networks.
The core classification metrics are compared in Figure 12, which shows the superior performance of the ANSEC-MM model. The proposed model achieves a peak F1-score of 0.90, outperforming the strongest baseline by over 18%. This supports the efficacy of integrating influencer node dynamics and expression capacity for robust sentiment classification. Similarly, the ANSEC-MM model demonstrated superior specificity by maintaining substantially lower false positive rates (FPR) across all recall thresholds, as evidenced in Figure 13. At 90% recall, ANSEC-MM achieves an FPR of only 0.12, outperforming the baseline models. This indicates robust discrimination capability while maintaining high sensitivity in negative sentiment detection.
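For reference, the metrics discussed here (precision, recall, F1-score, and false positive rate) are all derived from the binary confusion matrix. The sketch below uses the standard definitions; the function name and the counts are hypothetical and do not reproduce the study's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics with 'negative sentiment' as the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fpr = fp / (fp + tn)  # share of non-negative posts flagged as negative
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


# Hypothetical confusion-matrix counts
m = classification_metrics(tp=80, fp=20, tn=180, fn=20)
```

Because FPR depends on true negatives while precision does not, a model can hold a low FPR on an imbalanced dataset and still lose precision, which is why both views are reported.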
Further, the precision-recall trade-off analysis depicted in Figure 14 shows that ANSEC-MM maintains high classification accuracy even at demanding recall thresholds. At 90% recall, the proposed model achieved 0.85 precision, outperforming baseline methods that suffer substantial precision degradation. This consistent performance across recall levels highlights the robustness of the model in negative sentiment identification while minimizing false positives. At the same time, the AUC evaluation illustrated in Figure 15 revealed the discriminative power of ANSEC-MM, with 0.94 AUROC and 0.92 PR-AUC, exceeding all comparative models. This performance across both ROC and precision-recall spaces indicates robust capability in distinguishing negative sentiments across different operating conditions. The margin in PR-AUC in particular demonstrates enhanced performance on the imbalanced classification task characteristic of social media sentiment analysis. These results support the reliability of the model for practical deployment in negative sentiment detection systems.
As shown in Figure 16, the proposed model achieved an optimal balance between sensitivity and specificity, maintaining FPR below 0.12 across all recall thresholds while baseline models exhibit substantial FPR degradation at higher recall levels. Again, the precision-recall trade-off analysis in Figure 17 highlights the consistent performance of ANSEC-MM, maintaining precision above 0.85 even at 90% recall, demonstrating robust negative sentiment classification accuracy.
4.4. Mitigating the Likelihood of Negative Sentiments and Ethical Consideration
The model compares negative and positive sentiments over a specific time period and, on the basis of expression capacity, predicts and controls the propagation of negative sentiments, thereby focusing on maximizing positive or desired opinions. As shown in Figure 18, when lowering the generation of negative sentiments at 55 influencer nodes, EANN, BERT, and AOAN prevented the spread of only 60%, 50%, and 60% of the negative sentiments, respectively, whereas the proposed ANSEC-MM model discarded nearly 88% of the negative opinions, as defined by the evaluation protocol in Section 4.1. This was possible because ANSEC-MM bolstered mainstream network credibility and monitored the likelihood during the dissipation phase. The expression capacity accounts for the likelihood that influencer nodes become inactive in spreading opinions. It captures the dynamic behavior of influencer nodes and considers factors such as user personality and the event to better account for and mitigate the spread of unusual user behavior within the network. The performance of BERT and AOAN from 105 to 205 influencer nodes remained almost synchronized, because BERT strengthened monitoring and promoted evidence-based judgment to rebuild trust, while AOAN restricted negative content through platform rules to reduce emotional spread. At 255 influencer nodes, the proposed model still mitigated up to 30% of the negative sentiments, whereas the other approaches remained below 10%.