MDPI - Publisher of Open Access Journals

20 pages, 13974 KB

Open Access Article

A Perceptual Rate Control Algorithm Based on JND for Screen Content Video

by Huijie Zheng, Jing Chen and Qi Lin

Sensors 2026, 26(12), 3866; https://doi.org/10.3390/s26123866 - 17 Jun 2026

Viewed by 25

The rate control algorithm in video-coding standards is designed for natural video by default. However, computer-generated screen content video (SCV) differs markedly from natural video captured by a camera: it has distinct statistical characteristics, such as sharp edges, thin lines, and flat areas. As a result, the human visual system (HVS) focuses on different features when viewing screen content video. In sensor data visualization applications such as intelligent display terminals, industrial monitoring, and human–computer interaction interfaces, screen content video carries key information collected and reconstructed by image sensors, vision sensors, and multimodal sensors, and its edge structures and local details directly affect the interpretation accuracy and application reliability of that sensor information. It is therefore important, both theoretically and practically, to investigate perceptual rate control methods that integrate video content characteristics with human visual perception properties. In this paper, we propose a perceptual rate control algorithm for screen content video based on just-noticeable distortion (JND), which is established on edge profile reconstruction with tolerable variations. First, the target bit rate allocation at the frame level and the CTU level is based on a perceptual weight calculated from the JND factor and the reconstructed edge characteristics. Second, an intra rate-distortion (RD) model is established under the constraint of the JND model, with the similarity between reference frames and reconstructed frames taken as feedback. Finally, the proposed rate control algorithm, JND–perceptual rate control (JND–PRC), is integrated into the existing rate control framework of High-Efficiency Video Coding–Screen Content Coding (HEVC-SCC) to improve coding efficiency. The experimental results show that the proposed algorithm achieves better bit rate control precision than the reference platform and improves the RD performance of screen content video. In particular, compared with the HEVC-SCC reference software, the coding performance improves by 3.09 dB on average, the bit rate is reduced by 26.51% on average, and the average bit rate mismatch is within 1.159%.
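As a rough illustration of two quantities this abstract refers to, the sketch below splits a frame's bit budget across CTUs in proportion to a perceptual weight and expresses the bit rate mismatch as a percentage of the target. It is not the authors' implementation: the weight values, the purely proportional split, the function names, and the mismatch formula |actual − target| / target are assumptions chosen for illustration, since the abstract does not give the paper's weight or RD model.

```python
from typing import List


def allocate_ctu_bits(frame_bits: float, weights: List[float]) -> List[float]:
    """Split a frame-level bit budget across CTUs in proportion to their weights.

    The weights stand in for a perceptual weight (hypothetical values here);
    how the paper derives them from the JND factor is not reproduced.
    """
    total = sum(weights)
    if frame_bits < 0 or total <= 0 or any(w < 0 for w in weights):
        raise ValueError("need a non-negative budget and non-negative weights with a positive sum")
    return [frame_bits * w / total for w in weights]


def bitrate_mismatch_percent(target_kbps: float, actual_kbps: float) -> float:
    """Relative deviation of the achieved bit rate from the target, in percent."""
    if target_kbps <= 0:
        raise ValueError("target bit rate must be positive")
    return abs(actual_kbps - target_kbps) / target_kbps * 100.0


if __name__ == "__main__":
    # Hypothetical example: four CTUs, two of them given a higher weight
    # (for instance because they contain sharp text edges).
    print(allocate_ctu_bits(8000.0, [1.0, 3.0, 1.0, 3.0]))
    print(bitrate_mismatch_percent(1000.0, 1010.0))
```

Under this assumed definition, a lower mismatch percentage means the encoder landed closer to the requested bit rate.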

(This article belongs to the Special Issue Intelligent Sensing Technology for Image and Video Processing)

26 pages, 997 KB

Open Access Article

Zero-Shot Multimodal Sentiment Analysis Using LVLMs as a Triage Signal for Video Platform Moderation

by Anggi Hanafiah, Winda Monika, Arbi Haza Nasution, Aytuğ Onan, Yohei Murakami and Hafiza Oktasia Nasution

Digital 2026, 6(2), 40; https://doi.org/10.3390/digital6020040 - 16 May 2026

Viewed by 386

Abstract

Children increasingly consume online video content, creating a growing need for scalable approaches to support content moderation workflows. However, directly identifying harmful or policy-violating content, such as violence, sexual content, or self-harm, remains a complex task that typically requires specialized classifiers and domain-specific annotations. In this context, sentiment analysis can provide complementary information by capturing affective signals expressed through language and visual cues. This study does not treat sentiment polarity as a direct indicator of unsafe or policy-violating content. Instead, it explores multimodal sentiment analysis as an auxiliary triage signal that may help prioritize content for human review or identify segments requiring further inspection. This paper investigates the feasibility of using large vision–language models (LVLMs) for zero-shot multimodal sentiment analysis on utterance-aligned video segments. We evaluate two LVLMs, LLaVA-OneVision-7B and Qwen2.5-VL-7B, under three input settings: text-only, vision-only, and multimodal, using a conversational TV-series dataset consisting of short utterance-level video segments and transcripts. The results show that multimodal sentiment inference can provide useful screening signals without task-specific fine-tuning, although the benefits are model-dependent. LLaVA-OneVision-7B consistently outperforms Qwen2.5-VL-7B and benefits more clearly from combining textual and visual inputs, whereas Qwen2.5-VL-7B shows limited improvement across modality settings. We also analyze the trade-off between frame sampling and image resolution. Finally, we discuss limitations related to dataset scope, annotation subjectivity, class imbalance, and the need for broader validation before real-world deployment.

12 pages, 461 KB

Open Access Article

Dietary Management After Ulcerative Colitis Surgery: A Thematic Analysis of TikTok Content

by Oliver R. Kaye, Dakota R. Rhys-Jones, Orestis Argyriou, Sue Blackwell, Emma P. Halmos, Zaid Ardalan, Janindra Warusavitarne, Kapil Sahnan, Jonathan P. Segal, Ailsa L. Hart, Chu K. Yao and Itai Ghersin

Nutrients 2026, 18(7), 1110; https://doi.org/10.3390/nu18071110 - 30 Mar 2026

Viewed by 795

Abstract

Background/Objectives: For patients with Ulcerative Colitis (UC) requiring surgical treatment, post-operative dietary management can pose significant challenges. TikTok is emerging as a popular social media platform for the dissemination of health and nutrition information. The aim of this study is to analyse patient-generated content on TikTok regarding dietary management after UC surgery, in order to identify recurring themes and highlight patient priorities. Methods: Relevant TikTok videos were identified through a systematic search. Search terms were developed by combining ‘diet UC’ or ‘nutrition UC’ with common UC surgical procedures. For each search term, the first 10 videos were screened; if a search produced fewer than 10 results, all identified videos were retrieved. Videos were included if they were in English and strongly indicated that the content creator had been diagnosed with UC, had undergone relevant surgery, and was providing nutrition recommendations. Thematic analysis of video transcripts was conducted using Braun and Clarke’s framework to identify common themes. Results: The initial search found 89 videos, created between 2021 and 2024; after 12 duplicates were removed, 77 videos were screened. Sixteen English-language videos met the inclusion criteria and were analysed. Thematic analysis identified three overarching themes: (1) adaptive dietary progression in the post-surgical period, where patients described a phased approach to reintroducing foods; (2) personalisation of diet, highlighting individualised strategies for symptom and hydration management; and (3) the emotional and social impact of dietary restrictions and modifications, including fear of food and social isolation. Conclusions: This thematic analysis offers insight into how patients navigate the complex management of diet following UC surgery. It is important for clinicians to discuss with patients the dietary information and online content they are exposed to in relation to their condition. Additionally, clinical practice should evolve to embrace patient-centred, multidisciplinary approaches that validate lived experience, ensure consistent dietary guidance, and address the psychological burden of dietary restriction.

(This article belongs to the Section Nutritional Policies and Education for Health Promotion)

18 pages, 446 KB

Open Access Article

TikTok and Instagram as Putative Social Media in Promoting Healthy Eating Habits in Youths At-Risk for Eating/Feeding Disorders and Body Image Dissatisfaction

by Laura Orsolini, Giulio Longo, Teresa Cantarini, Salvatore Reina and Umberto Volpe

Brain Sci. 2026, 16(4), 379; https://doi.org/10.3390/brainsci16040379 - 30 Mar 2026

Viewed by 1304

Abstract

Background: The widespread use of Social Networks (SNS), particularly among youths, could promote Feeding and Eating Disorders (FEDs), but could also serve as a tool for implementing FED prevention strategies. This study aimed to identify which SNS could be most effective for implementing primary and secondary FED prevention. Methodology: A cross-sectional study was conducted via an Italian population-based survey, distributed using a snowball sampling strategy. The survey was completed by 283 participants aged 18–35 and comprised the Bergen Social Media Addiction Scale (BSMAS), the SCOFF screening tool for FEDs, items from the Body Uneasiness Test (BUT), and the Mukbang Addiction Scale (MAS). Results: The sample was predominantly female (69.3%). Participants screening positive on the SCOFF were more frequently TikTok users. Stepwise logistic regression analysis showed that TikTok use was associated with SCOFF positivity (OR = 1.9) and body image concerns (e.g., spending a lot of time in front of the mirror; OR = 1.9). Instagram use was associated with body image dissatisfaction (OR = 3.9). In the overall sample, the likelihood of screening positive on the SCOFF was associated with TikTok use (OR = 1.7), higher BSMAS scores (OR = 1.1), exposure to body positivity/neutrality content (OR = 1.9), and watching Mukbang videos (OR = 1.8). Conclusions: TikTok and, to a lesser extent, Instagram appear to be widely used by young individuals vulnerable to FEDs and body image dissatisfaction. These platforms may therefore represent strategic channels for delivering educational and preventive interventions targeting eating behaviors and body image among young people. Further longitudinal research is needed to clarify causal relationships and evaluate the effectiveness of SNS-based prevention strategies.

(This article belongs to the Special Issue Emerging Trends in Youth Mental Health)

19 pages, 7295 KB

Open Access Article

Video Identifying and Eraser: Use Multi-Task Cascaded Convolutional Neural Network to Enhance Safety in a Text-to-Video Diffusion Model

by Shuang Lin, Ranran Zhou and Yong Wang

Appl. Sci. 2026, 16(6), 2995; https://doi.org/10.3390/app16062995 - 20 Mar 2026

Viewed by 451

Abstract

Current security solutions predominantly rely on cloud-based implementations, often neglecting computational resource constraints and operational efficiency. While contemporary methodologies typically require additional training, the few that operate without retraining frequently yield suboptimal performance. To address these limitations, this work leverages a pre-trained MTCNN architecture to detect faces of copyright-protected individuals. We construct a facial landmark database comprising five critical fiducial points, which serves as a supplementary module integrated into the Stable Diffusion framework, enabling real-time security filtering for synthesized video content. The proposed system utilizes MTCNN models pre-trained in the cloud to build a repository of copyrighted facial signatures, generating a geometric parameter database of facial landmarks. This database, coupled with a parallel verification unit, functions as a plugin within the standard Stable Diffusion pipeline. By leveraging Stable Diffusion’s native decoder, we decode stochastic frames from the U-Net latent representations and perform real-time comparative analysis to identify potential copyright violations in generated video sequences. Upon detecting an infringement, an on-screen display (OSD) alert notifies the user and immediately halts the text-to-video (T2V) generation process. Experimental evaluations demonstrate that our framework effectively mitigates the resource constraints and latency issues inherent in edge deployment scenarios of prior security implementations. Leveraging MTCNN’s proven robustness and extensive edge compatibility for facial recognition, the proposed detection and obfuscation plugin integrates seamlessly with Stable Diffusion while preserving generation quality.

(This article belongs to the Special Issue Applied Multimodal AI: Methods and Applications Across Domains)

31 pages, 2433 KB

Open Access Article

Quality vs. Populism in Short-Video Political Communication: A Multimodal Study of TikTok

by Alicia Rodas-Coloma, Marcos Cabezas-González, Sonia Casillas-Martín and Pedro Nevado-Batalla Moreno

Journal. Media 2026, 7(1), 46; https://doi.org/10.3390/journalmedia7010046 - 25 Feb 2026

Viewed by 1820

Abstract

The article examines how framing and actor identity structure attention in short-video politics using a country-level corpus from Ecuador. It assembles 4612 public TikTok videos from official accounts and politically salient hashtags, extracts multimodal text via automatic speech recognition and on-screen OCR, and constructs two continuous indices: a quality index (programmatic, efficacy-oriented content) and a populism index (antagonistic, people-versus-elite cues). Engagement is modeled as a fractional response (binomial GLM with logit link), with robustness checks using OLS on logit(ER) and Poisson counts with an offset for log(plays + 1). Models include affect (positive sentiment and anger), hour/day controls, and actor fixed effects (leader, creator, institution, party, and media). The indices display construct validity: quality aligns with positive/joyful tone and populism with anger. Net of controls, populism is positively and consistently associated with engagement across estimators; the association for quality is small and often null or negative. Effects are heterogeneous: leaders gain under both frames, creators primarily under populism, and media modestly under populism, while institutions face penalties under both, and parties show limited returns. Monthly series reveal event-linked intensification of populism, and hashtag networks are modular, mapping onto institutional, partisan, and creator ecosystems. A design analysis identifies a non-populist pathway (benefit-first micro-explanations, concise captions, targeted hashtags, and joyful/efficacy affect) that raises engagement without antagonism. The study contributes a reproducible, open-source pipeline for survey-free, multimodal framing measurement and clarifies how persona × frame interactions and meso-level discursive structure jointly organize attention in short-video politics.
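The outcome transformations named in this abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions: it treats the engagement rate (ER) as interactions divided by plays, clips ER away from 0 and 1 before taking the logit, and computes the log(plays + 1) offset used with count models. The definition of "interactions", the clipping constant `eps`, and all function names are hypothetical; the abstract does not specify them, and the authors' pipeline is not reproduced here.

```python
import math


def engagement_rate(interactions: int, plays: int) -> float:
    """Share of plays that led to an interaction, capped at 1.0.

    What counts as an interaction (likes, comments, shares, ...) is an
    assumption of this sketch, not something stated in the abstract.
    """
    if plays <= 0 or interactions < 0:
        raise ValueError("plays must be positive and interactions non-negative")
    return min(interactions / plays, 1.0)


def logit(p: float, eps: float = 1e-6) -> float:
    """Log-odds of p, with p clipped to [eps, 1 - eps] so the result is finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def poisson_offset(plays: int) -> float:
    """Exposure offset log(plays + 1) for a Poisson model of interaction counts."""
    if plays < 0:
        raise ValueError("plays must be non-negative")
    return math.log(plays + 1)


if __name__ == "__main__":
    er = engagement_rate(interactions=50, plays=1000)  # hypothetical video
    print(er, logit(er), poisson_offset(1000))
```

The logit maps a bounded rate onto the whole real line, which is what makes an OLS robustness check on logit(ER) meaningful; the offset lets a count model compare videos with very different exposure.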

25 pages, 1558 KB

Open Access Article

Towards Scalable Monitoring: An Interpretable Multimodal Framework for Migration Content Detection on TikTok Under Data Scarcity

by Dimitrios Taranis, Gerasimos Razis and Ioannis Anagnostopoulos

Electronics 2026, 15(4), 850; https://doi.org/10.3390/electronics15040850 - 17 Feb 2026

Viewed by 797

Abstract

Short-form video platforms such as TikTok (TikTok Pte. Ltd., Singapore) host large volumes of user-generated, often ephemeral, content related to irregular migration, where relevant cues are distributed across visual scenes, on-screen text, and multilingual captions. Automatically identifying migration-related videos is challenging due to this multimodal complexity and the scarcity of labeled data in sensitive domains. This paper presents an interpretable multimodal classification framework designed for deployment under data-scarce conditions. We extract features from platform metadata, automated video analysis (Google Cloud Video Intelligence), and Optical Character Recognition (OCR) text, and compare text-only, OCR-only, and vision-only baselines against a multimodal fusion approach using Logistic Regression, Random Forest, and XGBoost. In this pilot study, multimodal fusion consistently improves class separation over single-modality models, achieving an F1-score of 0.92 for the migration-related class under stratified cross-validation. Given the limited sample size, these results are interpreted as evidence of feature separability rather than definitive generalization. Feature importance and SHAP analyses identify OCR-derived keywords, maritime cues, and regional indicators as the most influential predictors. To assess robustness under data scarcity, we apply SMOTE to synthetically expand the training set to 500 samples and evaluate performance on a small held-out set of real videos, observing stable results that further support feature-level robustness. Finally, we demonstrate scalability by constructing a weakly labeled corpus of 600 videos using the identified multimodal cues, highlighting the suitability of the proposed feature set for weakly supervised monitoring at scale. Overall, this work serves as a methodological blueprint for building interpretable multimodal monitoring pipelines in sensitive, low-resource settings.
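For readers unfamiliar with SMOTE, the core idea is to create each synthetic sample by interpolating between a real minority-class sample and one of its nearest neighbours. The sketch below is a minimal standard-library version of that interpolation step, written for illustration only: the function names, the toy feature vectors, the choice of `k`, and the seed are assumptions, and it is not the implementation used in the study (which the abstract does not name).

```python
import math
import random
from typing import List

Vector = List[float]


def nearest_neighbors(points: List[Vector], index: int, k: int) -> List[int]:
    """Indices of the k points closest (Euclidean) to points[index], excluding itself."""
    dists = sorted(
        (math.dist(points[index], p), j) for j, p in enumerate(points) if j != index
    )
    return [j for _, j in dists[:k]]


def smote(minority: List[Vector], n_new: int, k: int = 3, seed: int = 0) -> List[Vector]:
    """Generate n_new synthetic samples by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))            # pick a real sample
        j = rng.choice(nearest_neighbors(minority, i, k))  # and one of its neighbours
        gap = rng.random()                          # position along the segment, in [0, 1)
        synthetic.append(
            [a + gap * (b - a) for a, b in zip(minority[i], minority[j])]
        )
    return synthetic


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors for the minority class.
    real = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote(real, n_new=5):
        print(point)
```

Because every synthetic point lies on a segment between two real samples, oversampling of this kind adds no information beyond the original feature space, which is consistent with the abstract's caution about interpreting results from a small sample.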

(This article belongs to the Special Issue Multimodal Learning for Multimedia Content Analysis and Understanding)

22 pages, 1659 KB

Open Access Article

Lightweight Depression Detection Using 3D Facial Landmark Pseudo-Images and CNN-LSTM on DAIC-WOZ and E-DAIC

by Achraf Jallaglag, My Abdelouahed Sabri, Ali Yahyaouy and Abdellah Aarab

BioMedInformatics 2026, 6(1), 8; https://doi.org/10.3390/biomedinformatics6010008 - 4 Feb 2026

Cited by 1 | Viewed by 2060

Abstract

Background: Depression is a common mental disorder, and its early and objective diagnosis is challenging. Recent advances in deep learning show promise for processing audio and video content when screening for depression. Nevertheless, most current methods rely on raw video processing or multimodal pipelines, which are computationally costly, difficult to interpret, and raise privacy issues, restricting their use in actual clinical settings. Methods: To overcome these constraints, we introduce, for the first time, a purely visual, lightweight deep learning framework based solely on spatiotemporal 3D facial landmarks extracted from clinical interview videos in the DAIC-WOZ and Extended DAIC-WOZ (E-DAIC) datasets. Our method does not use raw video or any type of semi-automated multimodal fusion. Because raw video processing can be computationally expensive and is not well suited to investigating specific variables, we instead take a temporal series of 3D landmarks, convert it to pseudo-images (224 × 224 × 3), and feed these into a CNN-LSTM framework. Importantly, the CNN-LSTM can analyze both the spatial configuration and the temporal dynamics of facial behavior. Results: The experimental results indicate macro-average F1 scores of 0.74 on DAIC-WOZ and 0.762 on E-DAIC, demonstrating robust performance under heavy class imbalance, with a variability of ±0.03 across folds. Conclusion: These results indicate that landmark-based spatiotemporal modeling is a promising direction for lightweight, interpretable, and scalable automatic depression detection. Our results also suggest opportunities for fully embedding ADI systems within the framework of real-world MHA.

24 pages, 2996 KB

Open Access Article

What Does Bullet Screen Bring to Video Platform? A Theoretical Analysis Comparing Different Bullet Screen Modes

by Xingzhen Zhu and Li Li

J. Theor. Appl. Electron. Commer. Res. 2025, 20(4), 338; https://doi.org/10.3390/jtaer20040338 - 2 Dec 2025

Viewed by 1208

Abstract

Many video platforms (e.g., TikTok and Bilibili) choose to provide bullet screens on their video content. With different types of bullet screen features available, platforms face the challenge of choosing an appropriate bullet screen strategy, especially when consumers have different preferences for bullet screens. To address this challenge, this paper constructs a game-theoretic model to analyze the optimal bullet screen strategy for video platforms with two-sided market characteristics. Although some argue that bullet screens can be detrimental to the platform’s advertising business, our study shows that when the bullet screen feature attracts more consumers, it benefits both the platform and the advertisers. Additionally, we find that as consumers’ attention to bullet screens increases or the proportion of consumers who prefer bullet screens rises, the video platform will enhance the quality of the bullet screens it provides and raise advertising prices for advertisers. However, the platform’s profits do not necessarily increase accordingly, as they are also influenced by the platform’s bullet screen cost coefficient. Our comparative analysis of the three bullet screen modes reveals that when consumers in the market are ad-fatigued, the mode that allows bullet screens to cover ads is the optimal choice for the platform; when the cross-side network effects of advertisers on consumers differ, the profit-maximizing mode depends on the size of those effects. Our study provides important managerial insights for video platforms, especially on how to provide bullet screens under two-sided market structures. In future research, we will strive to better integrate consumer attention theory with theoretical modeling.

28 pages, 2524 KB

Open Access Article

A Multimodal Analysis of Automotive Video Communication Effectiveness: The Impact of Visual Emotion, Spatiotemporal Cues, and Title Sentiment

by Yawei He, Zijie Feng and Wen Liu

Electronics 2025, 14(21), 4200; https://doi.org/10.3390/electronics14214200 - 27 Oct 2025

Viewed by 1538

Abstract

To quantify the communication effectiveness of automotive online videos, this study constructs a multimodal deep learning framework. Existing research often overlooks the intrinsic and interactive impact of textual and dynamic visual content. To bridge this gap, our framework conducts an integrated analysis of both the textual (titles) and visual (frames) dimensions of videos. For visual analysis, we introduce FER-MA-YOLO, a novel facial expression recognition model tailored to the demands of computational communication research. Enhanced with a Dense Growth Feature Fusion (DGF) module and a multiscale Dilated Attention Module (MDAM), it enables accurate quantification of on-screen emotional dynamics, which is essential for testing our hypotheses on user engagement. For textual analysis, we employ a BERT model to quantify the sentiment intensity of video titles. Applying this framework to 968 videos from the Bilibili platform, our regression analysis, which modeled four distinct engagement dimensions (reach, support, discussion, and interaction) separately, in addition to a composite effectiveness score, reveals several key insights: emotionally charged titles significantly boost user interaction; visually, the on-screen proportion of human elements positively predicts engagement, while excessively high visual information entropy weakens it. Furthermore, neutral expressions boost view counts, and happy expressions drive interaction. This study offers a multimodal computational framework that integrates textual and visual analysis and provides empirical, data-driven insights for optimizing automotive video content strategies, contributing to the growing application of computational methods in communication research.

(This article belongs to the Special Issue Advances in Data-Driven Artificial Intelligence)

11 pages, 731 KB

Open Access Systematic Review

Is YouTube™ a Reliable Source of Information for the Current Use of HIPEC in the Treatment of Ovarian Cancer?

by Francesco Mezzapesa, Elisabetta Pia Bilancia, Margarita Afonina, Stella Di Costanzo, Elena Masina, Pierandrea De Iaco and Anna Myriam Perrone

Cancers 2025, 17(19), 3222; https://doi.org/10.3390/cancers17193222 - 2 Oct 2025

Cited by 1 | Viewed by 1014

Abstract

Introduction: YouTube™ is a widely accessible platform with unfiltered medical information. This study aimed to evaluate the educational value and reliability of YouTube™ videos on Hyperthermic Intraperitoneal Chemotherapy (HIPEC) for advanced epithelial ovarian cancer treatment. Methods: YouTube™ videos were searched using the keywords “ovarian cancer”, “debulking surgery”, “hyperthermic”, and “HIPEC”. Patient Education Materials Assessment Tool for Audiovisual Content (PEMAT A/V) score, DISCERN, Misinformation Scale, and the Global Quality Scale (GQS) were employed to assess the clarity, quality, and reliability of the information presented. Results: Of the 150 YouTube™ videos screened, 71 were suitable for analysis and categorized by target audience (general public vs. healthcare workers). Most (57, 80.2%) were uploaded after the “Ov-HIPEC” trial (18 January 2018), with a trend toward more videos for healthcare workers (p = 0.07). Videos for the general public were shorter (p < 0.001) but received more views (p = 0.06) and likes (p = 0.09), though they were of lower quality. The DISCERN score averaged 50 (IQR: 35–60), with public-targeted videos being less informative (p < 0.001), a trend mirrored by the Misinformation Scale (p < 0.001) and GQS (p < 0.001). The PEMAT A/V scores showed 80% Understandability (IQR: 62–90) and 33% Actionability (IQR: 25–100), with no significant difference between groups (p = 0.15, p = 0.4). Conclusions: While YouTube™ provides useful information for healthcare professionals, it cannot be considered a reliable source for patients seeking information on HIPEC for ovarian cancer. Many videos contribute to misinformation by not properly explaining treatment indications, timing, adverse effects, multimodal approaches, or clinical trial findings.

(This article belongs to the Section Cancer Informatics and Big Data)

20 pages, 2745 KB

Open Access Article

Uses of Metaverse Recordings in Multimedia Information Retrieval

by Patrick Steinert, Stefan Wagenpfeil, Ingo Frommholz and Matthias L. Hemmje

Multimedia 2025, 1(1), 2; https://doi.org/10.3390/multimedia1010002 - 10 Aug 2025

Cited by 1 | Viewed by 1526

Abstract

Metaverse Recordings (MVRs), screen recordings of user experiences in virtual environments, represent a mostly underexplored field. This article addresses the integration of MVR and Multimedia Information Retrieval (MMIR). Unlike conventional media, MVRs can include additional streams of structured data, such as Scene Raw Data (SRD) and Peripheral Data (PD), which capture graphical rendering states and user interactions. We explore the technical facets of recordings in the Metaverse, detailing diverse methodologies and their implications for MVR-specific Multimedia Information Retrieval. Our discussion not only highlights the unique opportunities of MVR content analysis, but also examines the challenges they pose to conventional MMIR paradigms. It addresses the key challenges around the semantic gap in existing content analysis tools when applied to MVRs and the high computational cost and limited recall of video-based feature extraction. We present a model for MVR structure, a prototype recording system, and an evaluation framework to assess retrieval performance. We collected a set of 111 MVRs to study and evaluate the intricacies. Our findings show that SRD and PD provide significant, low-cost contributions to retrieval accuracy and scalability, and support the case for integrating structured interaction data into future MMIR architectures.

32 pages, 1280 KB

Open Access Review

Effectiveness of Technology-Based Interventions in Promoting Lung Cancer Screening Uptake and Decision-Making Among Patients

by Safa Elkefi, Nelson Gaillard and Rongyi Wu

Int. J. Environ. Res. Public Health 2025, 22(8), 1250; https://doi.org/10.3390/ijerph22081250 - 9 Aug 2025

Cited by 3 | Viewed by 2392

Abstract

This study reviews how technology-based interventions have been designed and implemented to promote lung cancer screening (LCS), support shared decision-making, and enhance patient engagement. A systematic search of six databases in February 2025 identified 28 eligible studies published between 2014 and 2025. Most interventions were home-based and self-guided, including videos, websites, mobile apps, telehealth, and patient portal messages. Common features included risk calculators, multimedia content, simplified navigation, and integration with electronic medical records. These tools aim to raise awareness, improve informed decision-making, and support smoking cessation. While 82% of studies reported positive effects on knowledge and decision-making confidence, only some showed increased screening uptake. Key barriers included limited internet access, low digital literacy, provider time constraints, fear or anxiety, and concerns about radiation or cost. Despite these challenges, digital tools show promise in advancing LCS promotion. Their effectiveness, however, depends on thoughtful design, integration into clinical workflows, and equitable access. Future work should address structural and contextual challenges to scale digital health solutions and reduce disparities in screening participation. This review identifies both the potential and limitations of current interventions and offers guidance for enhancing impact through targeted, accessible, and user-informed approaches.

(This article belongs to the Section Infectious Diseases, Chronic Diseases, and Disease Prevention)

22 pages, 7735 KB

Open Access Article

Visual Perception of Peripheral Screen Elements: The Impact of Text and Background Colors

by Snježana Ivančić Valenko, Marko Čačić, Ivana Žiljak Stanimirović and Anja Zorko

Appl. Sci. 2025, 15(14), 7636; https://doi.org/10.3390/app15147636 - 8 Jul 2025

Cited by 1 | Viewed by 3013

Abstract

Visual perception of screen elements depends on their color, font, and position in the user interface design. Objects in the central part of the screen are perceived more easily than those in the peripheral areas. However, the peripheral space is valuable for applications like advertising and promotion and should not be overlooked. Optimizing the design of elements in this area can improve user attention to peripheral visual stimuli during focused tasks. This study evaluates how different combinations of text and background color affect the visibility of moving textual stimuli in the peripheral areas of the screen while attention is focused on a central task. Specifically, it investigates how background color, combined with white or black text, affects participants' attention, and identifies which background color makes a specific word most noticeable in the peripheral part of the screen. We designed quizzes to present stimuli with black or white text on various background colors in the peripheral regions of the screen. The background colors tested were blue, red, yellow, green, white, and black. Saturation and brightness were kept constant while the hue was varied. Among ten combinations of background and text color, we aimed to determine the most noticeable combination in the peripheral part of the screen. The combination of white text on a blue background resulted in the shortest detection time (1.376 s), while black text on a white background achieved the highest accuracy rate at 79%. The results offer valuable insights for improving peripheral text visibility in user interfaces across various visual communication domains such as video games, television content, and websites, where peripheral information must remain noticeable despite centrally focused user attention and complex viewing conditions.

13 pages, 1932 KB

Open Access Article

Evaluation of the Quality and Educational Value of YouTube Videos on Class IV Resin Composite Restorations

by Rashed A. AlSahafi, Hesham A. Alhazmi, Israa Alkhalifah, Danah Albuhmdouh, Malik J. Farraj, Abdullah Alhussein and Abdulrahman A. Balhaddad

Dent. J. 2025, 13(7), 298; https://doi.org/10.3390/dj13070298 - 30 Jun 2025

Cited by 4 | Viewed by 1423

Abstract

Objectives: The increasing reliance on online platforms for dental education necessitates an assessment of the quality and reliability of available resources. This study aimed to evaluate YouTube videos as educational tools for Class IV resin composite restorations. Methods: The first 100 YouTube videos were screened, and 73 met the inclusion criteria. The videos were evaluated using the Video Information and Quality Index (VIQI) and specific content criteria derived from the dental literature. Videos with a score below the mean were identified as low-content videos. Results: No significant differences were noted between high- and low-content videos when examining the number of views, number of likes, duration, days since upload, viewing rate, interaction index, and number of subscribers (p > 0.05). The high-content videos demonstrated higher mean values compared with the low-content videos in flow (4.11 vs. 3.21; p < 0.0001), accuracy (4.07 vs. 3.07; p < 0.0001), quality (4 vs. 2.66; p < 0.0001), and precision (4.16 vs. 2.86; p < 0.0001). The overall VIQI score was significantly higher (p < 0.0001) for high-content videos (Mean 16.34; SD 2.46) compared with low-content videos (Mean 11.79; SD 2.96). For content score, high-content videos (Mean 9.36; SD 1.33) had a higher score (p < 0.0001) than low-content videos (Mean 4.90; SD 2.04). The key areas lacking sufficient coverage included occlusion, shade selection, and light curing techniques. Conclusions: While a significant portion of YouTube videos provided high-quality educational content, notable deficiencies were identified. This analysis serves as a call to action for both content creators and educational institutions to prioritize the accuracy and completeness of online dental education.

(This article belongs to the Special Issue Dental Education: Innovation and Challenge)
