MDPI - Publisher of Open Access Journals

26 pages, 2041 KB

Open Access Article

Digital Information Cascades and Sustainable Visitor Flow Management: Evidence from GPS Trajectories and Social Media During an Urban Festival

by Yundi Wang and Zhibin Xing

Sustainability 2026, 18(10), 4952; https://doi.org/10.3390/su18104952 - 14 May 2026

Abstract

Urban festivals attract substantial numbers of tourists, which consequently imposes significant strain on host cities through spatial overcrowding, uneven pressure on infrastructure, and diminished quality of the visitor experience. Destination management organizations (DMOs) require effective tools to redistribute tourist flows; however, the influence of social media on tourists’ actual destination choices remains insufficiently understood. We ask whether social media discussion intensity (“buzz”) causally influences tourists’ destination choices and whether the effect grows stronger during festivals when information asymmetry is at its peak. Combining 95,692 taxi GPS trajectories with 5995 geotagged Twitter records from the 2019 Songkran Festival in Bangkok, we constructed an exponentially weighted moving average (EWMA) buzz variable with a temporal lag that establishes causal ordering. A conditional logit model shows that district-level buzz significantly raises destination choice probability and that the effect is amplified during the festival. Causal identification rests on a triangulated strategy that combines temporal lag, placebo permutation, and Bartik shift-share instrumental variables. The festival-period IV-corrected estimate ($\hat{\beta}^{IV} = +0.019$, $p < 0.001$) is 51% larger than the within-period OLS estimate ($\hat{\beta}^{OLS} = +0.012$, $p < 0.001$), a gap consistent with classical measurement-error attenuation in sparse social-media data, and a panel 2SLS analysis at the district–day level isolates a causal visitation channel confirming that cascades reinforce spatial concentration at the tourist-flow level. The aggregate Gini coefficient of spatial concentration declines over the study window in a statistically significant monotonic trend. The positive district-level correlation between buzz and congestion does not survive district and date fixed effects, which indicates that it reflects underlying differences in attractiveness across districts rather than a direct within-district channel. These findings provide an empirical foundation for information-based visitor flow management by identifying the underlying behavioral mechanism rather than evaluating a designed intervention.
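To make the lagged EWMA buzz variable concrete, the following is a minimal sketch rather than the authors' implementation. It assumes buzz is built from per-period post counts for one district, that the recursion is the standard s_t = alpha * x_t + (1 - alpha) * s_{t-1}, and that the temporal lag simply shifts the series so that a trip in period t sees only posts up to period t - lag. The function names, the smoothing factor `alpha`, and the one-period lag are illustrative assumptions; the abstract does not report them.

```python
from typing import List, Optional


def ewma(counts: List[float], alpha: float) -> List[float]:
    """Standard EWMA: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for x in counts:
        # Seed the recursion with the first observation (an assumption).
        prev = smoothed[-1] if smoothed else x
        smoothed.append(alpha * x + (1.0 - alpha) * prev)
    return smoothed


def lagged_buzz(
    counts: List[float], alpha: float = 0.5, lag: int = 1
) -> List[Optional[float]]:
    """Shift the EWMA by `lag` periods so buzz at t uses posts up to t - lag."""
    if lag < 0:
        raise ValueError("lag must be non-negative")
    s = ewma(counts, alpha)
    # The first `lag` periods have no earlier information, so they are None.
    return [None] * min(lag, len(s)) + s[: max(len(s) - lag, 0)]


if __name__ == "__main__":
    posts_per_period = [4, 0, 8, 2]  # hypothetical post counts for one district
    print(lagged_buzz(posts_per_period, alpha=0.5, lag=1))
    # prints [None, 4.0, 2.0, 5.0]
```

Shifting the smoothed series, rather than smoothing a shifted series, gives the same values here and keeps the lag explicit, which is the property the abstract relies on for causal ordering.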

(This article belongs to the Special Issue Sustainability in the Hospitality and Tourism Industry in the Age of Digitization)

19 pages, 4841 KB

Open Access Article

Mining Patient Narratives to Analyze Lifestyle–Blood Glucose Relationships: An LLM-Based Text Mining Framework

by Kazuyuki Matsumoto, Minoru Yoshida and Chikaho Karino

J 2026, 9(2), 14; https://doi.org/10.3390/j9020014 - 13 May 2026

Viewed by 73

Abstract

Lifestyle-related diseases such as diabetes are closely influenced by daily habits, yet the complex interactions between lifestyle factors and blood glucose variation remain insufficiently quantified. This study proposes a natural language processing (NLP) framework that analyzes long-form illness blogs to identify lifestyle factors associated with elevated blood glucose levels. Diabetes-related narratives were collected from a Japanese illness blog portal (TOBYO) and processed through GPT-4o-based automated labeling, BERT-series contextual embeddings, and LightGBM classification. For Type 2 Diabetes classification, the model achieved an F1-score of 0.73 using JMedRoBERTa embeddings, outperforming baseline models (BERT = 0.70; Twitter-RoBERTa = 0.65). Key factors contributing to glucose elevation were identified through feature importance analysis, with dietary behavior, lack of exercise, poor sleep, and stress emerging as major contributors. These findings demonstrate the potential of combining large language models with structured machine learning to extract health-relevant knowledge from patient narratives. The proposed approach contributes to preventive healthcare by offering interpretable, data-driven insights into lifestyle–glycemic relationships, and provides a foundation for personalized diabetes risk monitoring and AI-based health management applications.

(This article belongs to the Section Computer Science & Mathematics)

20 pages, 1548 KB

Open Access Review

Twenty-Five Years of Sentiment Analysis in Urban Environments: Thematic Trends and Future Perspectives

by Iuria Betco, Cláudia M. Viana, Eduardo Gomes, Jorge Rocha and Diogo Gaspar Silva

Urban Sci. 2026, 10(5), 265; https://doi.org/10.3390/urbansci10050265 - 12 May 2026

Viewed by 254

Abstract

This paper offers a comprehensive overview of academic research on sentiment analysis in urban built environments from 2000 to 2025. Based on data from the scientific database Scopus and drawing on bibliometric tools like Bibliometrix (R) and VOSviewer for performance analysis and scientific mapping, it identifies publication trends, key influential works, leading authors and institutions, funding sources, and thematic clusters. The final dataset comprises 1315 English-language documents authored by 3855 researchers across 160 sources, with a total of 14,058 citations worldwide. Academic production increased after 2009, peaking in 2025. Keyword and network analyses highlight the central themes and methodological approaches in the study of sentiment analysis in urban built environments. These include social media platforms like Twitter/X, machine learning, smart cities, artificial intelligence, mental health, and urban planning. China, the USA, and India lead in publication output. Over the last twenty-five years, key publication outlets included Sustainability (Switzerland), Cities, and the International Journal of Environmental Research and Public Health, while the National Natural Science Foundation of China has been the main funder. The paper discusses how sentiment analysis can support urban planning and public health by linking environmental features to well-being and explores emerging methodological trends like deep learning, multimodal approaches, and context-aware models. Overall, it maps the field’s intellectual landscape and argues for future directions in human-centered, data-driven urban decision-making.

20 pages, 1907 KB

Open Access Communication

Quantifying the Oral Cancer Public Awareness Deficit in Germany (2015–2023)

by Babak Saravi, Michael Vollmer, Daman Deep Singh, Lara Schorn, Julian Lommen, Felix Schrader, Max Wilkat, Andreas Vollmer, Veronika Shavlokhova, Marius Hörner, Norbert Kübler and Christoph Sproll

Cancers 2026, 18(8), 1236; https://doi.org/10.3390/cancers18081236 - 14 Apr 2026

Viewed by 507

Abstract

Objective: To quantify the gap between oral cancer disease burden and public awareness in Germany, and to characterize research dissemination patterns across social media platforms. Methods: We conducted a multi-dimensional analysis integrating: (1) Robert Koch Institut cancer registry data for oral and maxillofacial malignancies (ICD-10: C00–C06) from 2015 to 2023; (2) Google Trends search interest for cancer-related German terms; (3) Altmetric data for 2581 PubMed-indexed oral cancer publications; and (4) sentiment analysis of 10,308 social media posts. Age-standardized incidence rates were calculated using the European Standard Population. Results: Over the study period, 65,757 oral cavity cancer cases were registered. Google Trends analysis revealed a 64% attention deficit for “Mundkrebs” (oral cancer; mean: 17) compared to “Brustkrebs” (breast cancer; mean: 47). Case numbers declined from 7577 (2019) to 6870 (2023; −9.3%), while age-standardized rates decreased by 15.5% (11.6 to 9.8 per 100,000), with males disproportionately affected (−17.7%). Research dissemination was dominated by X/Twitter (86.2%), with minimal policy document (0.3%) or clinical guideline (0.3%) citations. Sentiment analysis revealed 77% positive public reception. Regional analysis identified an East–West divide, with Eastern German states showing 22% higher search interest. Conclusions: A substantial public awareness deficit exists for oral cancer in Germany, paradoxically widening during a period of declining diagnoses potentially associated with COVID-19-related diagnostic delays. The positive public sentiment toward oral cancer research suggests a favorable environment for targeted awareness campaigns, particularly in Western German states where search interest is lowest. These findings have practical implications for designing regionally tailored awareness campaigns prioritizing anatomically specific terminology. Future research should evaluate the effectiveness of such targeted interventions and assess whether post-pandemic diagnoses present at more advanced stages.
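The percentages reported in the Results can be reproduced from the figures quoted in the abstract. The sketch below assumes the declines are ordinary relative changes, (new - old) / old, and that the "attention deficit" is one minus the ratio of the two mean search-interest values; the second definition is an inference from the numbers, not something the abstract states. The helper names are illustrative.

```python
def relative_change(old: float, new: float) -> float:
    """Relative change from `old` to `new`, as a fraction of `old`."""
    return (new - old) / old


def attention_deficit(reference_mean: float, target_mean: float) -> float:
    """Assumed definition: shortfall of the target term relative to the reference term."""
    return 1.0 - target_mean / reference_mean


if __name__ == "__main__":
    # Case numbers, 2019 to 2023
    print(round(100 * relative_change(7577, 6870), 1))   # prints -9.3
    # Age-standardized rate per 100,000
    print(round(100 * relative_change(11.6, 9.8), 1))    # prints -15.5
    # Mean search interest: "Brustkrebs" = 47, "Mundkrebs" = 17
    print(round(100 * attention_deficit(47, 17)))        # prints 64
```

Under these assumptions all three quoted values (−9.3%, 15.5%, and 64%) are recovered to the stated precision.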

(This article belongs to the Special Issue Oral Cancer and Precancerous Lesions: Advances in Diagnosis, Prognosis, and Therapy)

26 pages, 565 KB

Open Access Article

Multi-Strategy Improvement and Comparative Research on Data-Driven Social Network Construction in Edge-Deficient Scenarios for Social Bot Account Detection

by Junjie Wang and Minghu Tang

Information 2026, 17(4), 360; https://doi.org/10.3390/info17040360 - 9 Apr 2026

Viewed by 344

Abstract

Accurate social bot detection relies on simulated data to alleviate the scarcity of labeled real-world datasets. Synthetic graph data serves as the core training resource for detection models within simulated data; nevertheless, edge deficiency in real social networks (induced by privacy constraints and data collection limitations) gives rise to “pseudo-isolated nodes” and distorts the quality of synthetic graph data. Furthermore, mainstream data-driven synthetic graph generation methods lack systematic and credible comparative analyses. To tackle these problems, this study optimizes two representative synthetic graph generation approaches (the Chung-Lu model and the Random Classifier-based Multi-Hop (RCMH) sampling + diffusion model) and puts forward an edge completion strategy grounded in sociological theories. Multiple groups of comparative experiments are conducted to assess the performance of the improved methods and the edge completion strategy. Experimental results demonstrate that the “interest + social association” edge completion strategy achieves an F1-score of 0.7051, and the improved sampling + diffusion model integrated with edge completion reaches an F1-score of 0.7071, somewhat better than the traditional and unmodified methods. This work preliminarily enhances the reliability of synthetic graph generation methods and provides relatively high-quality synthetic social graph data for social bot detection. It should be noted that the proposed methods are validated solely on Twitter-derived datasets, and their effectiveness remains to be verified in cross-platform adaptation and dynamic social network scenarios.

(This article belongs to the Section Information Security and Privacy)

19 pages, 3130 KB

Open Access Article

SGMLN: Sentiment-Guided Mutual Learning Network for Multimodal Sarcasm Detection

by Yiran Wang, Xin Zhao and Yongtang Bao

Sensors 2026, 26(8), 2304; https://doi.org/10.3390/s26082304 - 8 Apr 2026

Viewed by 390

Abstract

Social networks such as Twitter have grown rapidly and are now flooded with sarcastic comments, both in text and in images. Detecting sarcasm in multimodal data has significant social value and is attracting increasing research attention. However, most studies overlook the role of sentiment, even though sentiment information in text is closely linked to clues of sarcasm. Additionally, few consider how text and images align semantically. To address these issues, we propose a sentiment-guided mutual learning network (SGMLN) for multimodal sarcasm detection. SGMLN utilizes sentiment information to inform the combination of text and image features, and employs mutual learning to facilitate knowledge sharing among classifiers. We design a sentiment-guided attention layer that injects sentiment into both modalities, producing features that capture sarcasm more effectively. Sentic-BERT extracts sentiment-aware vectors from text, using word-level sentiment as a mask. In mutual learning, a logistic distribution function measures differences between classifiers, improving knowledge transfer between modalities. This step boosts multimodal understanding and model performance. By introducing sentiment-aware representations and semantic alignment, SGMLN bridges the gap between text and images, making them more consistent. Experiments on public datasets demonstrate that our model is effective and outperforms alternatives.

(This article belongs to the Section Sensing and Imaging)

22 pages, 1170 KB

Open Access Article

Adverse Drug Reaction Detection on Social Media Based on Large Language Models

by Hao Li and Hongfei Lin

Information 2026, 17(4), 352; https://doi.org/10.3390/info17040352 - 7 Apr 2026

Viewed by 524

Abstract

Adverse drug reaction (ADR) detection is essential for ensuring drug safety and effective pharmacovigilance. The rapid growth of users’ medication reviews posted on social media has introduced a valuable new data source for ADR detection. However, the large scale and high noise inherent in social media text pose substantial challenges to existing detection methods. Although large language models (LLMs) exhibit strong robustness to noisy and interfering information, they are often limited by issues such as stochastic outputs and hallucinations. To address these challenges, this paper proposes two generative detection frameworks based on Chain of Thought (CoT), namely LLaMA-DetectionADR for Supervised Fine-Tuning (SFT) and DetectionADRGPT for low-resource in-context learning. LLaMA-DetectionADR automatically generates CoT reasoning sequences to construct an instruction tuning dataset, which is then used to fine-tune the LLaMA3-8B model via Quantized Low-Rank Adaptation (QLoRA). In contrast, DetectionADRGPT leverages clustering algorithms to select representative unlabeled samples and enhances in-context learning by incorporating CoT reasoning paths together with their corresponding labels. Experimental results on the Twitter and CADEC social media datasets show that LLaMA-DetectionADR achieves excellent performance, with F1 scores of 92.67% and 86.13%, respectively. Meanwhile, DetectionADRGPT obtains competitive F1 scores of 87.29% and 82.80% with only a few labeled examples, approaching the performance of fully supervised advanced models. The overall results demonstrate the effectiveness and practical value of the proposed CoT-based generative frameworks for ADR detection from social media.

(This article belongs to the Topic Generative AI and Interdisciplinary Applications)

35 pages, 2740 KB

Open Access Article

Prediction of Depression Risk on Social Media Using Natural Language Processing and Explainable Machine Learning

by Ronewa Mabodi, Elliot Mbunge, Tebogo Makaba and Nompumelelo Ndlovu

Appl. Sci. 2026, 16(7), 3489; https://doi.org/10.3390/app16073489 - 3 Apr 2026

Viewed by 558

Abstract

Major Depressive Disorder (MDD) is a significant global health burden that contributes to disability and reduced quality of life. Its impact extends beyond individuals, placing emotional, social, and economic strain on families and healthcare systems worldwide. Despite its prevalence, MDD remains widely misunderstood, with limited mental health literacy and persistent stigma often preventing individuals from seeking help. This research explored the prediction of MDD utilising social media data via Natural Language Processing (NLP), Machine Learning (ML), and explainable Machine Learning (xML) techniques. The research aimed at identifying depressive indicators on X (formerly Twitter) and developing interpretable models for depression risk detection. The study’s methodology followed the Cross-Industry Standard Process for Data Mining (CRISP-DM) framework to ensure a systematic approach to data analysis. Data was collected via X’s API and processed using regex-based noise removal, normalisation, tokenisation, and lemmatisation. Symptoms were mapped to DSM-5-TR criteria at the post-level, with user-level MDD risk assessed based on symptom persistence over a two-week period. Risk levels were classified as No Risk, Monitor, and High Risk to facilitate early intervention. Six ML models were trained and tested, while the Synthetic Minority Over-sampling Technique (SMOTE) was applied to mitigate class imbalance. The dataset was partitioned into training and testing sets using an 80:20 split. ML models were evaluated, and the Extreme Gradient Boosting model outperformed the others. Extreme Gradient Boosting achieved an accuracy of 0.979, F1-score of 0.970, and ROC-AUC of 0.996, surpassing benchmark results reported in prior studies. Explainability techniques, such as LIME and tree-based feature importance, enhance model transparency and clinical interpretability. Depressed mood consistently emerged as the highest-weighted predictor across different models. The findings highlight the value of aligning ML models with validated diagnostic frameworks to improve trustworthiness and reduce false positives. Future research can expand beyond text-based analysis by incorporating multimodal features to broaden diagnostic depth.

(This article belongs to the Special Issue Deep Learning and Machine Learning in Information Systems)

25 pages, 2031 KB

Open Access Article

A Hybrid Machine Learning Approach for Classifying Indonesian Cybercrime Discourse Using a Localized Threat Taxonomy

by Firman Arifman, Teddy Mantoro and Dini Oktarina Dwi Handayani

Information 2026, 17(3), 301; https://doi.org/10.3390/info17030301 - 20 Mar 2026

Viewed by 547

Abstract

Indonesia’s rapid digital growth has been accompanied by escalating cyber threats, with public discourse on social media emerging as a critical but underutilized source of threat intelligence. This discourse is characterized by informal language and local nuances that render existing international cybercrime taxonomies ineffective, creating a gap in scalable, locally relevant threat analytics. This study introduces the Indonesian Cybercrime Threat Taxonomy (ICTT), a novel five-dimensional framework tailored to Indonesian online environments. An end-to-end OSINT pipeline was developed to collect 2344 samples from X (formerly Twitter) and YouTube, employing weak supervision with 12 high-precision regex patterns to generate training labels. A state-of-the-art IndoBERT model was fine-tuned on this data, and its performance was compared against rule-based and hybrid classification models. On a manually annotated gold-standard dataset of 600 samples, both the IndoBERT and hybrid models achieved 96.8% accuracy, significantly outperforming the rule-based baseline (66.7%). The models demonstrated strong generalization across both social media platforms, and the hybrid approach provided an effective balance of high performance and interpretability. This research demonstrates that informal public discourse can be systematically transformed into structured threat intelligence. The ICTT and the accompanying hybrid classification system provide a scalable, interpretable, and locally relevant foundation for cyber threat analytics in Indonesia, establishing a methodological blueprint for other low-resource language contexts.
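The weak-supervision step and the hybrid classifier might look roughly like the sketch below. It is illustrative only: the abstract does not list the 12 regex patterns, the ICTT label names, or how the hybrid model combines rules with IndoBERT, so the three patterns, the labels, and the rule-first fallback design are all hypothetical. The `model` argument stands in for any callable that maps text to a label, such as a fine-tuned classifier.

```python
import re
from typing import Callable, List, Optional, Pattern, Tuple

# Hypothetical patterns and labels for illustration; not the paper's rule set.
RULES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"\b(phishing|link palsu)\b", re.IGNORECASE), "phishing"),
    (re.compile(r"\b(penipuan|scam)\b", re.IGNORECASE), "fraud"),
    (re.compile(r"\b(ransomware|malware)\b", re.IGNORECASE), "malware"),
]


def weak_label(text: str) -> Optional[str]:
    """Return the label of the first matching rule, or None if no rule fires."""
    for pattern, label in RULES:
        if pattern.search(text):
            return label
    return None


def hybrid_classify(text: str, model: Callable[[str], str]) -> str:
    """Assumed hybrid design: trust a rule when it fires, otherwise ask the model."""
    rule_label = weak_label(text)
    return rule_label if rule_label is not None else model(text)


if __name__ == "__main__":
    def fallback(_text: str) -> str:
        return "other"  # placeholder for a trained model

    print(weak_label("Awas link palsu dari bank!"))      # prints phishing
    print(hybrid_classify("cuaca cerah hari ini", fallback))  # prints other
```

A rule-first design keeps each rule-based decision traceable to a specific pattern, which is one way a hybrid system could offer the interpretability the abstract mentions.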

(This article belongs to the Special Issue Information Extraction and Language Discourse Processing)

18 pages, 10514 KB

Open Access Article

Digital Ethnography of Ethnic Cohesion: Social Media Narratives During a National Disaster in Sri Lanka

by G. H. B. A. de Silva and H. A. K. Sumedha

Soc. Sci. 2026, 15(3), 195; https://doi.org/10.3390/socsci15030195 - 18 Mar 2026

Viewed by 750

Abstract

Social media platforms have become central infrastructures for disaster communication, yet their role in shaping ethnic cohesion in post-conflict societies remains insufficiently examined. Sri Lanka, marked by a legacy of ethnic conflict, provides a critical context for exploring how moments of crisis are narratively and symbolically negotiated online. This study employs a qualitative digital ethnographic approach to analyze publicly accessible social media content circulated during a recent national disaster. Data were collected from Facebook, X (formerly Twitter), and TikTok between 1 and 10 December, yielding an initial corpus of 344 posts, of which 200 were purposively selected for in-depth analysis following the removal of duplicated and near-identical content. Reflexive thematic analysis identified three dominant and interrelated narrative patterns: expressions of solidarity, resource sharing and mutual aid, and visual–symbolic representations of unity. These narratives were articulated through inclusive language, unity-oriented hashtags, depictions of material assistance, and imagery emphasizing co-presence across religious and institutional lines. Engagement metrics were examined as indicators of narrative resonance within platform visibility structures. The findings suggest that social media temporarily foregrounded discursive cohesion and symbolic unity during the disaster period. However, these representations should be interpreted as context-specific and performative rather than as evidence of durable inter-ethnic integration. This study contributes methodologically and empirically to research on digital ethnography, disaster communication, and social cohesion in post-conflict settings by demonstrating how social media platforms operate as spaces for the performative articulation of ethnic unity during disasters.

17 pages, 715 KB

Open Access Article

Social Media and Macroeconomic Factors as Drivers of Innovation: Evidence from Africa

by Emmanuel Olatunbosun Benjamin and Oreoluwa Ola

Youth 2026, 6(1), 30; https://doi.org/10.3390/youth6010030 - 5 Mar 2026

Viewed by 1171

Abstract

Africa’s expanding youth population and rapid digitalization present opportunities for innovation and, ultimately, entrepreneurship and economic growth relevant for Sustainable Development Goal (SDG) 8, Decent Work and Economic Growth. However, the role of social media in shaping these outcomes remains underexplored empirically. This study examines how platform-specific social media use influences innovation, operationalized through external search breadth and depth, while considering macroeconomic moderators. Using panel data from 52 African countries from 2009 to 2022 and fixed effects regressions, the study links activities on Facebook, X (formerly Twitter), YouTube, LinkedIn, and Google to innovation indicators such as R&D expenditure, patent applications, and scientific publications. The findings suggest that YouTube use is consistently and positively associated with all innovation indicators, highlighting its role in knowledge diffusion and creative expression. By contrast, X and LinkedIn display neutral or negative effects. High internet penetration alone is not sufficient to spur innovation, underscoring the need for enabling macroeconomic factors such as GDP per capita and ease of doing business. This study concludes that visual open-access platforms, supported by education and institutional capacity, are vital for inclusive and sustained economic growth.

(This article belongs to the Special Issue Navigating the Hybrid Media Landscape: Youth Identity, Behaviour and Beliefs)

27 pages, 434 KB

Open Access Article

Stakeholder Engagement on Social Media and Firm Performance: Evidence from Multi-Platform Digital Interactions

by Berto Usman, Abdurrachman Bakrie, Ridwan Nurazi, Intan Zoraya and Somnuk Aujirapongpan

J. Risk Financial Manag. 2026, 19(2), 107; https://doi.org/10.3390/jrfm19020107 - 3 Feb 2026

Viewed by 1351

Abstract

This study examines the relationship between stakeholder engagement with corporate social responsibility (CSR) disclosures on social media and corporate financial performance, grounded in legitimacy theory and stakeholder theory. Using a panel dataset of 388 firm-year observations of Indonesian listed companies over the period 2019–2022, we investigate how stakeholder interactions across four social media platforms (Facebook, Twitter, Instagram, and YouTube) relate to firm performance measured by Return on Assets (ROA) and Return on Equity (ROE). Panel data regression results reveal that stakeholder engagement on visual-based platforms plays a significant role in enhancing financial performance. In particular, Instagram likes and YouTube likes are positively associated with ROA (β = 0.0004, p < 0.05; β = 0.0002, p < 0.05), while Instagram comments, YouTube likes, and YouTube views show a significant positive relationship with ROE (β = 0.011, p < 0.01; β = 0.0006, p < 0.01; β = 0.000249, p < 0.01). In contrast, engagement metrics on Facebook and Twitter do not exhibit a statistically significant association with firm performance. These findings suggest that stakeholder engagement with CSR disclosures through high-engagement, visual-oriented social media platforms can strengthen corporate legitimacy and stakeholder relationships, ultimately contributing to improved financial outcomes. The study highlights the strategic importance of platform-specific digital communication in enhancing firm performance.

(This article belongs to the Section Business and Entrepreneurship)

24 pages, 4461 KB

Open Access Article

SD-CVD Corpus: Towards Robust Detection of Fine-Grained Cyber-Violence Across Saudi Dialects in Online Platforms

by Abrar Alsayed, Salma Elhag and Sahar Badri

Information 2026, 17(1), 76; https://doi.org/10.3390/info17010076 - 12 Jan 2026

Viewed by 661

Abstract

This paper introduces the Saudi Dialects Cyber Violence Detection (SD-CVD) corpus, a large-scale, class-balanced Saudi-dialect corpus for fine-grained cyber violence detection on online platforms. The dataset contains 88,687 Saudi Arabic tweets annotated using a three-level hierarchical scheme that assigns each tweet to one of 11 mutually exclusive classes, covering benign sentiment (positive, neutral, negative), cyberbullying, and seven hate-speech subtypes (incitement to violence, gender, national, social class, tribal, religious, and regional discrimination). To mitigate the class imbalance common in Arabic cyber violence datasets, data augmentation was applied to achieve a near-uniform class distribution. Annotation quality was ensured through multi-stage review, yielding excellent inter-annotator agreement (Fleiss’ κ > 0.89). We evaluate three modeling paradigms: traditional machine learning with TF–IDF and n-gram features (SVM, logistic regression, random forest), deep learning models trained on fixed sentence embeddings (LSTM, RNN, MLP, CNN), and fine-tuned transformer models (AraBERTv02-Twitter, CAMeLBERT-MSA). Experimental results show that transformers perform best, with AraBERTv02-Twitter achieving the highest weighted F1-score (0.882) followed by CAMeLBERT-MSA (0.869). Among non-transformer baselines, SVM is most competitive (0.853), while CNN performs worst (0.561). Overall, SD-CVD provides a high-quality benchmark and strong baselines to support future research on robust and interpretable Arabic cyber-violence detection.
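For readers unfamiliar with the weighted F1-score used to rank the models above, the following minimal sketch computes it from scratch. It assumes the usual definition: per-class F1 averaged with weights proportional to each class's number of true instances (its support). The toy labels are hypothetical and unrelated to the SD-CVD data.

```python
from collections import Counter
from typing import Hashable, Sequence


def weighted_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Support-weighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    support = Counter(y_true)  # true instances per class
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        total += n * f1
    return total / len(y_true)


if __name__ == "__main__":
    truth = ["hate", "hate", "benign", "benign"]        # hypothetical labels
    predicted = ["hate", "benign", "benign", "benign"]
    print(round(weighted_f1(truth, predicted), 4))       # prints 0.7333
```

In a near-uniform class distribution such as the one described, the weighted and unweighted (macro) averages will be close, since every class carries a similar weight.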

20 pages, 945 KB

Open AccessArticle

A Pilot Study on Multilingual Detection of Irregular Migration Discourse on X and Telegram Using Transformer-Based Models

by Dimitrios Taranis, Gerasimos Razis and Ioannis Anagnostopoulos

Electronics 2026, 15(2), 281; https://doi.org/10.3390/electronics15020281 - 8 Jan 2026

Cited by 1 | Viewed by 890

Abstract

The rise of Online Social Networks has reshaped global discourse, enabling real-time conversations on complex issues such as irregular migration. Yet the informal, multilingual, and often noisy nature of content on platforms like X (formerly Twitter) and Telegram presents significant challenges for reliable automated analysis. This study presents an exploratory multilingual natural language processing (NLP) framework for detecting irregular migration discourse across five languages. Conceived as a pilot study addressing extreme data scarcity in sensitive migration contexts, this work evaluates transformer-based models on a curated multilingual corpus. It provides an initial baseline for monitoring informal migration narratives on X and Telegram. We evaluate a broad range of approaches, including traditional machine learning classifiers, SetFit sentence-embedding models, fine-tuned multilingual BERT (mBERT) transformers, and a Large Language Model (GPT-4o). The results show that GPT-4o achieves the highest performance overall (F1-score: 0.84), with scores reaching 0.89 in French and 0.88 in Greek. While mBERT excels in English, SetFit outperforms mBERT in low-resource settings, specifically in Arabic (0.79 vs. 0.70) and Greek (0.88 vs. 0.81). The findings highlight the effectiveness of transformer-based and large-language-model approaches, particularly in low-resource or linguistically heterogeneous environments. Overall, the proposed framework provides an initial, compact benchmark for multilingual detection of irregular migration discourse under extreme, low-resource conditions. The results should be viewed as exploratory indicators of model behavior on this synthetic, small-scale corpus, not as statistically generalizable evidence or deployment-ready tools. In this context, “multilingual” refers to robustness across different linguistic realizations of identical migration narratives under translation, rather than coverage of organically diverse multilingual public discourse.

(This article belongs to the Special Issue Artificial Intelligence-Driven Emerging Applications)

17 pages, 606 KB

Open AccessArticle

Emotional Digital Storytelling as a Driver of Social Media Engagement in Higher Education: A Multi-Platform Analysis

by José Carlos Losada Díaz and Javier Almela-Baeza

Information 2026, 17(1), 30; https://doi.org/10.3390/info17010030 - 1 Jan 2026

Viewed by 1969

Abstract

Digital storytelling has become a central component of emerging communication strategies, particularly in competitive higher-education environments where audience attention and engagement are increasingly mediated by social platforms. This study evaluates the impact of an emotional storytelling format—Historia(s) de Universidad (HdU)—implemented by the University of Murcia (UMU), comparing its performance with traditional institutional content across Instagram, TikTok, X (Twitter), and LinkedIn. A dataset of 6096 posts (September 2020–September 2023) and 25,636 audiovisual items was analysed using descriptive metrics, negative binomial and quasi-binomial regression models, and a differences-in-differences (DiD) design aligned with the formal launch of HdU in September 2022. The results indicate that emotionally driven storytelling posts consistently outperform institutional content in terms of visibility and interaction: HdU posts nearly double the engagement rate (OR ≈ 2.0) and increase interactions by 80% (RR ≈ 1.8; p < 0.001). The DiD analysis indicates a change associated with the implementation of HdU, with no evidence of pre-existing trends. Findings demonstrate that emotional narrative formats constitute an effective strategic tool for digital communication management, reinforcing institutional identity, enhancing stakeholder relationships, and contributing to reputation-building in higher education. The study discusses practical implications for the design of narrative-driven communication strategies and identifies avenues for future research, such as combining quantitative performance metrics with qualitative audience insights to deepen understanding of storytelling’s impact in university contexts.

(This article belongs to the Special Issue Social Media Mining: Algorithms, Insights, and Applications)

Search Results (740)
